METHODS AND COMPOSITIONS FOR DETECTING CANCERS

Cross-Reference to Related Applications 

This application claims the benefit of priority of U.S. Provisional Application No. 
60/386,653 filed June 5, 2002, the specification of which is incorporated by reference herein in 
its entirety.

Funding

Work described herein was supported by National Institutes of Health Grant R01 CA
67409. The United States Government has certain rights in the invention.

Background 

In 2001, over 1.2 million new cases of human cancer will be diagnosed and over 0.5
million people will die from cancer (American Cancer Society estimate). Despite this, more
people than ever are living with and surviving cancer. In 1997, for example, approximately 8.9
million living Americans had a history of cancer (National Cancer Institute estimate). People
are more likely to survive cancer if the disease is diagnosed at an early stage of development, 
since treatment at that time is more likely to be successful. Early detection depends upon 
availability of high-quality methods. Such methods are also useful for determining patient 
prognosis, selecting therapy, monitoring response to therapy and selecting patients for additional 
therapy. Consequently, there is a need for cancer diagnostic methods that are specific, accurate, 
minimally invasive, technically simple and inexpensive. 

Colorectal cancer (cancer of the colon or rectum) is one particularly important type of 
human cancer. Colorectal cancer is the second most common cause of cancer mortality in adult 
Americans (Landis, et al., 1999, CA Cancer J Clin, 49:8-31). Approximately 40% of individuals 
with colorectal cancer die. In 2001, it is estimated that there will be 135,400 new cases of
colorectal cancer (98,200 cases of colon and 37,200 cases of rectal cancer) and 56,700 deaths
(48,000 colon cancer and 8,800 rectal cancer deaths) from the disease (American Cancer
Society). As with other cancers, these rates can be decreased by improved methods for
diagnosis. Although methods for detecting colon cancer exist, the methods are not ideal.
Digital rectal exams (i.e., manual probing of rectum by a physician), for example, although
relatively inexpensive, are unpleasant and can be inaccurate. Fecal occult blood testing (i.e.,
detection of blood in stool) is nonspecific because blood in the stool has multiple causes.
Colonoscopy and sigmoidoscopy (i.e., direct examination of the colon with a flexible viewing
instrument) are both uncomfortable for the patient and expensive. Double-contrast barium
enema (i.e., taking X-rays of barium-filled colon) is also an expensive procedure, usually
performed by a radiologist. 

Other cancers, such as breast cancer, thyroid cancer and stomach cancer, cause significant
public health problems as well. For example, thyroid cancer is the most common endocrine
malignancy. In the United States, there are approximately 14,000 new patients and 1,100 deaths
per year (Shah et al., 1995, CA Cancer J Clin 45:352-68). Because of the disadvantages of
existing methods for detecting and treating cancer, new methods and tools in cancer diagnosis
and cancer therapy are needed.

Summary of the Invention 

In accordance with the present invention, new diagnostic tools and methods for detecting 
cancer (e.g., colon cancer, breast cancer, thyroid cancer, or stomach cancer) are provided. In 
certain aspects, the invention is based in part on the discovery of a novel polynucleotide 
sequence encoding a novel sodium/solute symporter-like protein (SLC5A8). Applicants 
previously referred to the SLC5A8 gene as the "Hull" gene. 

In one embodiment, the invention provides an isolated polypeptide comprising an amino
acid sequence selected from the group consisting of: a) an amino acid sequence at least 95%
identical to SEQ ID NO: 1; and b) an amino acid sequence encoded by a nucleic acid that
hybridizes under high stringency conditions to a nucleic acid of any one of SEQ ID NOs: 3 or 4,
wherein said polypeptide is a cell surface protein. The subject polypeptide comprises a
transmembrane domain as set forth in any one of SEQ ID NOs: 19-31. The present invention
contemplates the subject polypeptide as a sodium symporter.

In another embodiment, the invention provides an isolated antibody or fragment thereof,
which is specifically immunoreactive with an epitope of a SLC5A8 protein sequence as set forth
in SEQ ID NO: 1. The antibody of the invention can be selected from the group consisting of: a
polyclonal antibody, a monoclonal antibody, an Fab fragment and a single chain antibody.
Optionally, the antibody is labeled with a detectable label.

In another embodiment, the invention provides an isolated SLC5A8 nucleic acid selected
from the group consisting of: a) a nucleic acid comprising the nucleotide sequence of SEQ ID
NO: 2, or a complement thereof; b) a nucleic acid molecule that encodes a polypeptide
comprising the amino acid sequence at least 95% identical to the amino acid sequence of SEQ
ID NO: 7; and c) a nucleic acid molecule that hybridizes under stringent conditions to SEQ ID
NO: 2. Optionally, the nucleic acid of the invention further comprises a vector nucleic acid
sequence. In certain embodiments, the invention provides a kit comprising the SLC5A8 nucleic
acid probes or primers and instructions for use.

In another embodiment, the invention provides a host cell which contains the subject
SLC5A8 nucleic acid of the invention. In another embodiment, the invention provides a method
for producing the subject polypeptide, comprising culturing the host cell under conditions in
which the subject nucleic acid molecule is expressed.

In another embodiment, the invention provides a method for detecting the presence of
the subject SLC5A8 polypeptide in a sample, comprising: a) contacting the sample with an
antibody which selectively binds to the polypeptide of claim 1; and b) determining whether the
antibody binds to the polypeptide in the sample.

In another embodiment, the invention provides a kit for detecting a human SLC5A8
polypeptide comprising: (i) an antibody of claim 2; and (ii) a detectable label for detecting said
antibody.

In another embodiment, the invention provides a method for detecting the presence of
the SLC5A8 nucleic acid in a sample, comprising: a) contacting the sample with an SLC5A8
probe or primer; and b) determining whether the probe or primer binds to a SLC5A8 nucleic
acid in the sample.

In another embodiment, the invention provides a method for identifying a compound
which binds to the SLC5A8 polypeptide, comprising: a) contacting the polypeptide, or a cell
expressing the SLC5A8 polypeptide, with a test compound; and b) determining whether the
polypeptide binds to the test compound.

In another embodiment, the invention provides a method for modulating the activity of
the SLC5A8 polypeptide, comprising contacting the polypeptide or a cell expressing the
polypeptide with a compound which binds to the polypeptide in a sufficient concentration to
modulate the activity of the polypeptide.

WO 03/104427 PCT/US03/18239

In another embodiment, the invention provides a method of inhibiting aberrant activity of
a SLC5A8-expressing cell, comprising contacting the cell with a compound that modulates the
activity or expression of the polypeptide, in an amount which is effective to reduce or inhibit the
aberrant activity of the cell.

In certain embodiments, compounds used in the methods of the invention are selected
from the group consisting of a peptide, a phosphopeptide, a small organic molecule, an antibody,
and a peptidomimetic. Cells in the methods of the invention can be found in the colon, kidney,
lung, esophagus, small bowel, stomach, thyroid, uterus, and breast.

In another embodiment, the invention provides a method of treating or preventing a
disorder characterized by aberrant activity of a SLC5A8-expressing cell, in a subject, comprising
administering to the subject an effective amount of a compound that modulates the activity or
expression of the SLC5A8 polypeptide, such that the aberrant activity of the SLC5A8-expressing
cell is reduced or inhibited.

In another embodiment, the invention provides a transgenic mouse having germline and
somatic cells comprising a chromosomally incorporated transgene that disrupts the genomic
SLC5A8 gene and inhibits expression of said gene, wherein said disruption comprises insertion
of a selectable marker sequence resulting in said transgenic mouse exhibiting increased
susceptibility to the formation of tumors as compared to the wildtype mouse. The transgenic
mouse can be homozygous or heterozygous for the disruption.

In another embodiment, the invention provides a transgenic mouse having germline and
somatic cells in which at least one allele of a genomic SLC5A8 gene is disrupted by a
chromosomally incorporated transgene, which transgene inhibits the expression of the genomic
SLC5A8 gene, wherein (i) the genomic SLC5A8 gene encodes a SLC5A8 protein; and (ii) the
disruption comprises insertion of a selectable marker sequence, which replaces all or a portion of
the genomic SLC5A8 gene or is inserted into the coding sequence of the genomic SLC5A8
gene; and (iii) the transgenic mouse has increased susceptibility to the development of
neoplasms.

In another embodiment, the invention provides isolated mammalian cells comprising a
diploid genome including a chromosomally incorporated transgene, which transgene disrupts the
genomic SLC5A8 gene and inhibits expression of said gene. Optionally, the cells are mouse
cells.

In another embodiment, the invention provides a method for generating a mouse and
mouse embryonic stem cells having a functionally disrupted endogenous SLC5A8 gene,
comprising the steps of: (i) constructing a transgene construct including (a) a recombination
region having all or a portion of the endogenous SLC5A8 gene, which recombination region
directs recombination of the transgene with the endogenous SLC5A8 gene; and (b) a marker
sequence which provides a detectable signal for identifying the presence of the transgene in a
cell; (ii) transferring the transgene into embryonic stem cells of a mouse; (iii) selecting
embryonic stem cells having a correctly targeted homologous recombination between the
transgene and the SLC5A8 gene; (iv) transferring said cells identified in step (iii) into a mouse
blastocyst and implanting the resulting chimeric blastocyst into a female mouse; and
(v) selecting offspring harboring an endogenous SLC5A8 gene allele comprising the correctly
targeted recombination.

In another embodiment, the invention provides a method of evaluating the carcinogenic
potential of an agent comprising: (i) contacting the transgenic mouse of claim 16A with a test
agent; and (ii) comparing the number of transformed cells in a sample from the treated mouse
with the number of transformed cells in a sample from an untreated transgenic mouse or
transgenic mouse treated with a control agent, wherein the difference in the number of
transformed cells in the treated mouse, relative to the number of transformed cells in the absence
of treatment or treatment with a control agent, indicates the carcinogenic potential of the test
compound.

In another embodiment, the invention provides a method of evaluating an
anti-proliferative activity of a test compound, comprising: (i) providing a transgenic mouse of claim
16A having germline and somatic cells in which the expression of the SLC5A8 gene is inhibited
by said chromosomally incorporated transgene, or a sample of cells derived therefrom; (ii)
contacting the transgenic mouse or the sample of cells with a test agent; and (iii) determining the
number of transformed cells in a specimen from the transgenic mouse or in the sample of cells,
wherein a statistically significant decrease in the number of transformed cells, relative to the
number of transformed cells in the absence of the test agent, indicates the test compound is a
potential anti-proliferative agent.

In certain aspects, the present invention is based, at least in part, on Applicants'
discovery of a particular human genomic DNA region in which the cytosines within CpG
dinucleotides are methylated in tissues from human cancers and unmethylated in normal human
tissues. The region, referred to hereinafter as the "SLC5A8 methylation target region," is
encompassed by base pairs 82200 to 83267 of GenBank entry AC063951, and is located in the
promoter and/or exon 1 of the SLC5A8 gene. The present methods are also based, at least in
part, on Applicants' discovery that the levels of SLC5A8 transcript in tissues from human
cancers are lower than the levels of SLC5A8 transcript in normal tissues.

In one embodiment, the method comprises assaying for the presence of differentially
methylated SLC5A8 nucleotide sequences (e.g., in the SLC5A8 methylation target region) in a
tissue sample or a bodily fluid sample from a subject. Preferred bodily fluids include blood,
serum, plasma, a blood-derived fraction, stool, colonic effluent or urine. In one embodiment,
the method involves restriction enzyme/methylation-sensitive PCR. In another embodiment, the
method comprises reacting DNA from the sample with a chemical compound that converts
non-methylated cytosine bases (also called "conversion-sensitive" cytosines), but not methylated
cytosine bases, to a different nucleotide base. In a preferred embodiment, the chemical
compound is sodium bisulfite, which converts unmethylated cytosine bases to uracil. The
compound-converted DNA is then amplified using a methylation-sensitive polymerase chain
reaction (MSP) employing primers that amplify the compound-converted DNA template if
cytosine bases within CpG dinucleotides of the DNA from the sample are methylated.
Production of a PCR product indicates that the subject has cancer or precancerous adenomas.
Other methods for assaying for the presence of methylated DNA are known in the art.
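
The base conversion underlying the bisulfite approach described above can be illustrated in silico. The following Python sketch is offered for illustration only and is not part of the disclosed methods. It assumes complete conversion, writes converted uracil as thymine (as it would be read after amplification), and uses a toy sequence rather than any SEQ ID NO. The function names are hypothetical.

```python
def cpg_positions(seq):
    """Return 0-based positions of the C in every CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Idealized bisulfite treatment followed by amplification.

    Unmethylated C is converted (C -> U, written here as T);
    C at a position listed in methylated_positions is left unchanged.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Toy template (an assumption for illustration, not a disclosed sequence).
template = "ACGTCCGGACGA"
cpgs = cpg_positions(template)

# All CpG cytosines methylated versus no methylation at all.
fully_methylated = bisulfite_convert(template, set(cpgs))
unmethylated = bisulfite_convert(template)
```

In this sketch the two converted templates differ only at the CpG cytosines, which is the sequence difference that methylation-specific primers are designed to distinguish.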

In another embodiment, the method comprises assaying for decreased levels of an
SLC5A8 transcript in the sample. A sequence of the SLC5A8 transcript (SEQ ID NO: 3) is
shown in Figure 2. The SLC5A8 transcript is encoded by 15 exons within the present genomic
contig. In another aspect, the method comprises assaying for decreased levels of a protein
encoded by the SLC5A8 transcript in the sample.

In another embodiment, the present invention provides a detection method for prognosis
of a cancer (e.g., colon cancer, breast cancer, thyroid cancer, or stomach cancer) in a subject
known to have or suspected of having cancer. Such method comprises assaying for the presence
of methylated SLC5A8 DNA (e.g., in the SLC5A8 methylation target region) in a tissue sample
or bodily fluid from the subject. In certain cases, it is expected that detection of methylated
SLC5A8 DNA in a blood fraction is indicative of an advanced state of cancer (e.g., colon
cancer). In other cases, detection of methylated SLC5A8 DNA in a tissue or stool derived
sample or sample from other bodily fluids may be indicative of a cancer that will respond to
therapeutic agents that demethylate DNA or reactivate expression of the SLC5A8 gene.

In another embodiment, the present invention provides a method for monitoring over
time the status of cancer (e.g., colon cancer, breast cancer, thyroid cancer, or stomach cancer) in
a subject. The method comprises assaying for the presence of methylated SLC5A8 DNA (e.g.,
in the SLC5A8 methylation target region) in a tissue sample or bodily fluid taken from the
subject at a first time and in a corresponding tissue sample or bodily fluid taken from the subject
at a second time. Absence of methylated SLC5A8 DNA from the tissue sample or bodily fluid
taken at the first time and presence of methylated SLC5A8 DNA in the tissue sample or bodily
fluid taken at the second time indicates that the cancer is progressing. Presence of methylated
SLC5A8 DNA in the tissue sample or bodily fluid taken at the first time and absence of
methylated SLC5A8 DNA from the tissue sample or bodily fluid taken at the second time
indicates that the cancer is regressing.
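
The two-time-point comparison described above reduces to a small decision rule, sketched below in Python for illustration only. The function name and the "no change indicated" label are assumptions; the text specifies only the progressing and regressing outcomes.

```python
def classify_change(methylated_first, methylated_second):
    """Interpret methylated SLC5A8 DNA calls at two sampling times.

    Each argument is True if methylated SLC5A8 DNA was detected in the
    sample taken at that time, and False otherwise.
    """
    if not methylated_first and methylated_second:
        return "progressing"   # absent at first time, present at second
    if methylated_first and not methylated_second:
        return "regressing"    # present at first time, absent at second
    # The text does not assign a meaning to unchanged status;
    # this label is a placeholder introduced for the sketch.
    return "no change indicated"


status = classify_change(methylated_first=False, methylated_second=True)
```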

In another embodiment, the present invention provides a method for evaluating therapy
in a subject having cancer or suspected of having cancer (e.g., colon cancer, breast cancer,
thyroid cancer, or stomach cancer). The method comprises assaying for the presence of
methylated SLC5A8 DNA (e.g., in the SLC5A8 methylation target region) in a tissue sample or
bodily fluid taken from the subject prior to therapy and a corresponding bodily fluid taken from
the subject during or following therapy. Loss of or a decrease in the levels of methylated
SLC5A8 DNA in the sample taken after or during therapy as compared to the levels of
methylated SLC5A8 DNA in the sample taken before therapy is indicative of a positive effect of
the therapy on cancer regression in the treated subject.

The present invention also relates to oligonucleotide primer sequences for use in assays
(e.g., methylation-sensitive PCR assays or HpaII assays) designed to detect the methylation
status of the SLC5A8 gene. The present invention also relates to antibodies and to
oligonucleotides or oligomers for detecting the presence of the SLC5A8 protein or the SLC5A8
transcript, respectively, in samples obtained from a subject.

The present invention also provides a method of inhibiting or reducing growth of cancer
cells (e.g., colon cancer, breast cancer, thyroid cancer, or stomach cancer). The method
comprises increasing the levels of the protein encoded by SLC5A8 in cancer cells. In one
embodiment, the cells are contacted with the SLC5A8 protein or a biologically active equivalent
or fragment thereof under conditions permitting uptake of the protein or fragment. In another
embodiment, the cells are contacted with a nucleic acid encoding the SLC5A8 protein and
comprising a promoter active in the cancer cell, wherein the promoter is operably linked to the
region encoding the SLC5A8 protein, under conditions permitting the uptake of the nucleic acid
by the cancer cell. In another embodiment, the method comprises demethylating the methylated
SLC5A8 DNA, or otherwise reactivating the silenced SLC5A8 promoter.

In one embodiment, the application provides isolated or recombinant SLC5A8 nucleotide
sequences that are at least 80%, 85%, 90%, 95%, 98%, 99% or 100% identical to the nucleotide
sequence of any one of SEQ ID NOs: 2-4 and 21, and fragments of said sequences that are 10, 15,
20, 25, 50, 100, or 150 base pairs in length, wherein the SLC5A8 nucleotide sequences are
differentially methylated in an SLC5A8-associated disease cell.

In another embodiment, the application provides a method for detecting colon cancer,
comprising: a) obtaining a sample from a patient; and b) assaying said sample for the
presence of methylation of nucleotide sequences within at least two genes selected from the
group consisting of: SLC5A8, HLTF, p16, and hMLH1; wherein methylation of nucleotide
sequences within the two genes is indicative of colon cancer. In such methods, the sample is a
bodily fluid selected from the group consisting of blood, serum, plasma, a blood-derived
fraction, stool, urine, and a colonic effluent. For example, the bodily fluid is obtained from a
subject suspected of having or known to have colon cancer.
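
One possible reading of the multi-gene assay above is sketched below in Python, for illustration only. The sketch assumes that a sample is called positive when methylation is detected in at least two of the four listed genes; that threshold interpretation, the function name, and the dictionary format for assay calls are all assumptions rather than part of the disclosure.

```python
# Genes named in the text; the tuple order is arbitrary.
PANEL = ("SLC5A8", "HLTF", "p16", "hMLH1")


def panel_positive(calls, min_genes=2):
    """Summarize per-gene methylation calls for one sample.

    calls maps a gene name to True (methylation detected) or False.
    Genes missing from calls are treated as not assayed.
    Returns (is_positive, list_of_methylated_genes).
    """
    methylated = [gene for gene in PANEL if calls.get(gene, False)]
    return len(methylated) >= min_genes, methylated


# Hypothetical calls for a single sample.
result, genes = panel_positive({"SLC5A8": True, "HLTF": False, "p16": True})
```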

In another embodiment, the application provides a kit for detecting colon cancer in a
subject, comprising primers for detecting methylation of nucleotide sequence within at least two
genes selected from the group consisting of: SLC5A8, HLTF, p16, and hMLH1, wherein the
primers for detecting methylation of SLC5A8 nucleotide sequence are selected from SEQ ID
NOs: 5-11; wherein the primers for detecting methylation of HLTF nucleotide sequence are
selected from 5'-TGGGGTTTCGTGGTTTTTTCGCGC-3',
5'-CCGCGAATCCAATCAAACGTCGACG-3',
5'-ATTmGGGGTTrTGTGGTTTTTTTGTGT-3',
5'-ATr^Ann \CAAATCCAATCAAACATCAACA-3',
5'-GCACGACTAAAAAATAAATCGCCGCG-3',
5'-AAAGACACAACTAAAAAATAAATCACCACA-3',
5'-TAAAACCTCGTAACTTTCCCGCGCG-3',
5'-GTCGCGAGTTTAGTTAGACGTCGAC-3',
5'-TCCTAAAACCTCATAACTTTCCCACACA-3', and
5'-AGTTGTTGTGAGTTTAGTTAGATGTTGAT-3', wherein the primers for detecting
methylation of hMLH1 nucleotide sequence are selected from
5'-AACGAATTAATAGGAAGAGCGGATAGCG-3',
5'-CGTCCCTCCCTAAAACGACTACTACCC-3',
5'-CGTTTTTTTTTGAAGCGGTTATTGTTTGT-3', and
5'-AACGAACCAATAAAAAAAACAAACAACG-3'. The kit may further comprise a
compound to convert a template DNA. Optionally, the compound is bisulfite.

Brief Description of The Drawings 

Figure 1 shows the complete sequence of the genomic clone AC063951 (SEQ ID NO:
2), with nucleotides 82200-83267 underlined on pages 35 of Figure 1. This region (nucleotides
82200-83267 of AC063951, SEQ ID NO: 12, see Figure 4) encompasses the promoter and/or
exon 1 of the SLC5A8 gene, and is herein referred to as the "SLC5A8 methylation target
region."

Figure 2 shows the nucleotide sequence of the SLC5A8 mRNA transcript (SEQ ID NO:
3). The SLC5A8 transcript is encoded by 15 exons within the present genomic contig.

Figure 3 shows a diagram of the SLC5A8 methylation target region. CpG sites are
shown with circles and stems. The numerical coordinates are those of genomic clone
AC063951. Lollipops designate CpG sites that are potential acceptors of aberrant methylation.
Asterisks designate sites recognized by the HpaII restriction enzyme. Shown are the positions of
PCR primers that amplify regions crossing 6 HpaII sites, or regions crossing 4 HpaII sites. Also
shown is the position of PCR primers designed for a methyl-specific PCR (MS-PCR) assay.
Also shown in the gray bar is the 5' end of exon 1 of the SLC5A8 transcript, which overlaps with
the methylation sites detected in both MS-PCR and HpaII based assays. Lastly indicated is a
NotI site corresponding to methylation site 2D41 detected in Restriction Landmark Genome
Scanning assay as methylated in colon cancer cell lines, though not in primary tumors.

Figure 4 provides the sequence of AC063951 between nucleotides 82200-83267 (SEQ
ID NO: 12), and designates every CpG site with a gray lollipop, and shows the HpaII sites in the
assay as dark lollipops, and also shows the location of the PCR primers used in the assay. In this
figure, the base pairs have been renumbered sequentially from 1-1068, with nucleotide 82200
being renumbered as nucleotide 1.

Figure 5 shows the correlation between HpaII assays (over 4 HpaII sites and 6 HpaII
sites) and silencing of expression of the SLC5A8 transcript.

Figure 6 shows the results of the HpaII assays (over 4 HpaII sites and 6 HpaII sites) in
actual colon cancer tumors and normal control colon tissues.

Figure 7 shows the results of assay for methylation at 61 CpG sites enumerated in Figure
4, with site 1 corresponding to basepair 466 in Figure 4 and site 61 corresponding to basepair
1010. The bold arrows correspond to 4 of the HpaII sites at respectively basepairs 466, 691,
709, and 716 in Figure 4. Methylation was assayed by sequencing DNA from samples
following sodium bisulfite treatment of DNA that converts cytosine to uracil but leaves
methyl-cytosine unchanged. Bases that are methylated are coded black, unmethylated bases are coded
dark gray, and samples with both methylated and unmethylated bases are coded light gray.

Figure 8 shows the wild-type sequence of the anti-sense strand of AC063951 between
bases 82200-83267 (SEQ ID NO: 13). Note that the sequence is the reverse complement of that
shown in Figure 4, and therefore base number 1 on this diagram corresponds to basepair 83267
in AC063951, and to basepair 1068 in Figure 4. Indicated on this diagram is the position of the
MS-PCR1 primers (AS-meth) and the UMS-PCR1 primers (AS-unmethy). The methyl specific
MS-PCR1 primers amplify CpG sites numbered 6, 7, 8 and 15, 16, 17, 18 respectively in
Figure 7. The UMS-PCR1 primers interrogate CpG sites 7, 8 and 15, 16, 17, 18 respectively.
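
The three numbering schemes used in Figures 4 and 8 (AC063951 genomic coordinates, the 1-1068 sense-strand numbering, and the antisense-strand numbering) are related by simple arithmetic. The Python sketch below is illustrative only; the function names are hypothetical, and only the coordinates 82200, 83267 and 1068 come from the text.

```python
REGION_START = 82200                               # first base of target region
REGION_END = 83267                                 # last base of target region
REGION_LENGTH = REGION_END - REGION_START + 1      # 1068 bases

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def genomic_to_sense(pos):
    """AC063951 coordinate -> position in the Figure 4 numbering."""
    return pos - REGION_START + 1


def sense_to_antisense(pos):
    """Figure 4 position -> position on the reverse-complement strand."""
    return REGION_LENGTH - pos + 1


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


# Basepair 83267 is position 1068 in Figure 4 and base 1 in Figure 8.
last_in_sense = genomic_to_sense(REGION_END)
first_in_antisense = sense_to_antisense(last_in_sense)
```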

Figure 9 shows a region within SEQ ID NO: 13 shown in Figure 8 (nucleotides 300-600, 
SEQ ID NO: 14), and the sequences of the antisense strand that are amplified by the
methyl-specific and unmethyl-specific PCR primers.

Figure 10 shows the bisulfite converted sequence of a uniformly methylated SLC5A8
antisense strand (SEQ ID NO: 15), but not the wild-type sequence of the SLC5A8 antisense
strand (corresponding to Figure 8). Indicated again are the positions of the methylation specific
PCR primers for the MS-PCR1 assay.

Figure 11 shows the bisulfite converted sequence of a uniformly unmethylated SLC5A8
antisense strand (SEQ ID NO: 16), but not the wild-type sequence of the SLC5A8 antisense
strand shown in Figure 8. Indicated are the positions of the unmethylation specific PCR primers
for the UMS-PCR1 assay.

Figure 12 provides the bisulfite converted sequence of the unmethylated SLC5A8 sense 
strand of nucleotides 82200-83267 of AC063951, renumbered such that basepair 82200 is 
designated as nucleotide 1 (SEQ ID NO: 17). 

Figure 13 provides the bisulfite converted sequence of a uniformly methylated SLC5A8 
sense strand of nucleotides 82200-83267 (SEQ ID NO: 18).

Figure 14 shows the tabular results of MS-PCR1 assay performed on 31 colon cancer
cell lines that do or do not express the SLC5A8 transcript.

Figure 15 shows the tabular results of MS-PCR1 assay performed on 63 matched sets of
primary colon cancer tumor tissue and accompanying normal colon tissue.

Figure 16 shows the results of testing 12 normal colon tissues from individuals without
colon cancer.

Figure 17 shows the tabular results of the MS-PCR1 assay of 28 premalignant colon
adenomas, 68% of which are detected.

Figure 18 shows the amino acid sequence (SEQ ID NO: 1) of the SLC5A8 protein.

Figure 19 shows RT-PCR detection of the SLC5A8 transcript in normal colon and in a 
minority subset of colon cancer cell lines. 

Figure 20 shows RT-PCR detection of SLC5A8 transcript in colon cancer cell lines that 
have been treated with the DNA-demethylating agent 5-azacytidine. 5-azacytidine reactivates 
expression of the SLC5A8 gene in 6 of 8 colon cancer cell lines. 

Figure 21 demonstrates detection of methylation of the SLC5A8 locus by showing
resistance of the locus to HpaII digestion. The 4 HpaII assay (as described in the invention
disclosure) is based on PCR amplification of a portion of the SLC5A8 locus. Lanes labeled U
show control amplification of undigested SLC5A8 DNA. Lanes labeled M show amplification
[...] restriction enzyme MspI.



Figure 22 demonstrates detection of SLC5A8 DNA methylation in primary colon cancer 
tumors but not in matched normal tissue from the same patients. Samples labeled T represent 
colon cancer tumor tissue; whereas samples labeled N represent the matched normal tissue. 

Figures 23A-23B show the identification of SLC5A8. (A) Shown is the genomic 
structure of the SLC5A8 gene. Black boxes represent exons, and arrows the start codon and 
stop codons respectively. (B) The nucleotide sequence of the SLC5A8 coding region (SEQ ID 
NO: 4). 

Figures 24A-24F show SLC5A8 expression. (A) Shown is RT-PCR analysis 
demonstrating SLC5A8 transcript expression in three normal colon mucosa samples (Nl, N2, 
N3), but absence of SLC5A8 transcript in most colon cancer cell lines (remaining samples). (B) 
Shown is RT-PCR analysis demonstrating reactivation of SLC5A8 expression in cell lines 
treated with 5-azacytidine (+) compared to untreated (-) controls. (C) Methylation specific PCR
(MS-PCR) assay for methylated (M) or unmethylated (U) SLC5A8 exon 1 sequences detects
exclusively methylated templates in SLC5A8 silenced cell lines. (D) MS-PCR detects only
unmethylated SLC5A8 templates in SLC5A8 expressing cell lines. (E) MS-PCR detection of
methylated SLC5A8 templates in colon cancer tumors (T) antecedent to SLC5A8 methylated cell
lines (V425, V670). Matched normal colon tissue (N) shows only unmethylated templates.
Unmethylated templates in tumor tissue presumptively arise from contaminating non-malignant
cells. (F) MS-PCR analysis of colon cancer tumors (T) and matched normal (N) colon tissues.
Methyl specific bands are seen in each of the tumor samples, but none of the normal controls.

Figures 25A-25B show real time MS-PCR analysis of SLC5A8 methylation. Plotted are
1000 times the ratio of measured SLC5A8 methylated product to the control MYOD1 derived
product. (A) Detection of SLC5A8 methylation in primary colon cancer tissues. Column 1
displays values for normal colon tissues harvested from non-cancer resections (dark diamonds).
Column 2 displays values for normal colon tissues harvested from colon cancer resections (dark
diamonds). Column 3 displays values for colon cancer tissues divided into unmethylated
samples falling within the normal tissue range (dark diamonds at the bottom), versus methylated
samples showing values greater than the normal tissue range (light diamonds at the top).
Adjacent bars indicate population means. (B) Real time MS-PCR analysis of SLC5A8
methylation in aberrant crypt foci. Column 1 displays values for 24 normal colon tissues
harvested from colon resections from 11 individuals (dark diamonds). Column 2 displays values
for aberrant crypt foci harvested from the same 11 individuals' resections. Dark diamonds (at
the bottom) indicate unmethylated samples within the normal range, and light diamonds (at the
top) indicate methylated samples falling within the range previously demonstrated by
methylated cancers. Adjacent bars indicate the mean value for each group.
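
The plotted quantity described above, 1000 times the ratio of SLC5A8 methylated product to the MYOD1 control product, can be written out as a short calculation. The Python sketch below is illustrative only. The function names are hypothetical, and calling a sample methylated when its value exceeds the largest normal-tissue value is an assumed reading of "greater than the normal tissue range."

```python
def normalized_methylation(slc5a8_methylated, myod1_control):
    """Return 1000 x (SLC5A8 methylated product / MYOD1 control product)."""
    if myod1_control <= 0:
        raise ValueError("MYOD1 control product must be positive")
    return 1000.0 * slc5a8_methylated / myod1_control


def is_methylated(value, normal_values):
    """Call a sample methylated if it exceeds every normal-tissue value.

    The cutoff rule is an assumption made for this sketch.
    """
    return value > max(normal_values)


# Hypothetical measurements in arbitrary units.
value = normalized_methylation(slc5a8_methylated=5, myod1_control=100)
call = is_methylated(value, normal_values=[0.0, 2.5])
```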

Figure 26 shows real time MS-PCR analysis of SLC5A8 methylation in DNA
precipitated from the serum of colon cancer patients. Plotted are 1000 times the ratio of
measured SLC5A8 methylated product to the control MYOD1 derived product. Column 1
displays absence of detectable SLC5A8 methylation in serum of 13 individuals whose colon
cancer tumors assayed as unmethylated by MS-PCR (dark diamonds at the bottom). Column 2
displays values of SLC5A8 methylation in the serum of 10 individuals whose colon cancer
tumors assayed as methylated by MS-PCR. Dark diamonds (at the bottom) indicate 6 sera
without detectable methylation, and light diamonds (at the top) indicate 4 sera in which SLC5A8
methylation was detectable.

Figures 27A-27B show SLC5A8 suppression of colon cancer colony formation. Shown 
are the number of G418 resistant colonies arising from transfection with a SLC5A8 expression 
vector (SLC5A8) or a control empty expression vector (pcDNA) in SLC5A8 unmethylated and
expressing V364, V457, and V9M cells (panel A) as compared to SLC5A8 methylated and
deficient FET, V400, and RKO cells (panel B). 

Figure 27 shows the cloning of SLC5A8 transcript. Black bars indicate representative
ESTs. The lighter gray bar indicates sequence generated from an image clone. The dark gray
bar indicates open reading frame encoding SLC5A8 protein.

Figure 28 shows the protein alignments of SLC5A8, the closest murine homologue of
SLC5A8, the human sodium iodide symporter SLC5A5, and the human sodium dependent
multivitamin transporter SLC5A6.

Figures 30A-30B show methylation in SLC5A8 exon 1. (A) Diagrammatic
representation of the CpG island in SLC5A8 exon 1. Balloons represent CpG dinucleotides. 
Coordinates represent nucleotide positions numbered as per GenBank entry AC063951. 
Positions of the ATG and NotI site are indicated. Arrows cover the regions interrogated by 
primers for MS-PCR. (B) Diagrammatic summary of methylation status of the 62 CpG sites in 
SLC5A8 exon 1 as determined by sequencing of bisulfite converted genomic DNA. Each site is 
sequentially represented by one shaded block. Black represents sites that are fully methylated.
[...] represents sites that are fully unmethylated. And lighter gray represents sites that are
partially methylated. Samples include 9 SLC5A8 silenced cell lines (Off samples), 6
SLC5A8 expressing normal colonic mucosa (On samples designated N), and 3 SLC5A8
expressing cell lines (On samples designated V). Arrows indicate sites that are interrogated by 
MS-PCR primers and bracket a differentially methylated region that is unmethylated in SLC5A8 
expressing samples and is methylated in SLC5A8 silenced samples. 

Figure 30 shows methylation events in primary colon cancers. Shown is analysis of 64 
primary colon cancers for aberrant methylation at 4 genomic loci, SLC5A8, HLTF, hMLH1, and
p16. Black bars represent positive assays for methylation in tumor tissue, and gray bars
represent detection only of unmethylated alleles.

Figure 31 shows suppression of xenograft growth in 4 of 5 SLC5A8 expressing V400
transfected clones (square symbols, gray lines) as compared with control pools of V400 cells 
transfected with an empty expression vector (triangular symbols, black lines). 

Detailed Description of the Invention

I. Definitions 

For convenience, certain terms employed in the specification, examples, and appended 
claims are collected here. Unless defined otherwise, all technical and scientific terms used 
herein have the same meaning as commonly understood by one of ordinary skill in the art to 
which this invention belongs. 

The articles "a" and "an" are used herein to refer to one or to more than one (i.e., to at
least one) of the grammatical object of the article, unless the context clearly indicates otherwise.
By way of example, "an element" means one element or more than one element.

The terms "adenoma", "colon adenoma," and "polyp" are used herein to describe any
precancerous neoplasia of the colon. 

The term "blood-derived fraction" herein refers to a component or components of whole
blood. Whole blood comprises a liquid portion (i.e., plasma) and a solid portion (i.e., blood
cells). The liquid and solid portions of blood are each comprised of multiple components; e.g.,
different proteins in plasma or different cell types in the solid portion. One of these components
or a mixture of any of these components is a blood-derived fraction as long as such fraction is
missing one or more components found in whole blood.

"Cells," "host cells" or "recombinant host cells" are terms used interchangeably herein.
It is understood that such terms refer not only to the particular subject cell but to the progeny or
potential progeny of such a cell. Because certain modifications may occur in succeeding
generations due to either mutation or environmental influences, such progeny may not, in fact,
be identical to the parent cell, but are still included within the scope of the term as used herein. 

A "chimeric polypeptide" or "fusion polypeptide" is a fusion of a first amino acid 
sequence with a second amino acid sequence where the first and second amino acid sequences 
are not naturally present in a single polypeptide chain.

The term "colon" as used herein is intended to encompass the right colon (including the 
cecum), the transverse colon, the left colon, and the rectum. 

The terms "colorectal cancer" and "colon cancer" are used interchangeably herein to 
refer to any cancerous neoplasia of the colon (including the rectum, as defined above). 

The terms "compound", "test compound," and "agent" are used herein interchangeably
and are meant to include, but are not limited to, peptides, nucleic acids, carbohydrates, small 
organic molecules, natural product extract libraries, and any other molecules (including, but not 
limited to, chemicals, metals, and organometallic compounds). 

The term "compound-converted DNA" herein refers to DNA that has been treated or 
reacted with a chemical compound that converts unmethylated C bases in DNA to a different 
nucleotide base. For example, one such compound is sodium bisulfite, which converts 
unmethylated C to U. If DNA that contains conversion-sensitive cytosine is treated with sodium 
bisulfite, the compound-converted DNA will contain U in place of C. If the DNA which is
treated with sodium bisulfite contains only methylcytosine, the compound-converted DNA will 
not contain uracil in place of the methylcytosine. 
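
The conversion described in this definition can be illustrated with a short sketch. It is an illustration only and not part of the disclosed methods: the function name, the example sequence, and the representation of methylation as a set of 0-based positions are all assumptions made for this example.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Model sodium bisulfite treatment of a single DNA strand.

    Every C that is not listed in `methylated_positions` (0-based indices,
    a hypothetical way of marking methylcytosine) becomes U; methylated C
    and all other bases are left unchanged.
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("U")  # conversion-sensitive cytosine
        else:
            converted.append(base)  # methylcytosine or a non-C base
    return "".join(converted)


# Hypothetical 8-base template with CpG cytosines at positions 1 and 5.
template = "ACGTCCGA"
print(bisulfite_convert(template))                               # no methylation
print(bisulfite_convert(template, methylated_positions={1, 5}))  # CpGs methylated
```

In this sketch the unmethylated template loses every C, while the methylated template keeps C only at the two marked CpG positions, which is the sequence difference that downstream assays exploit.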

The term "de-methylating agent" as used herein refers to agents that restore activity and/or
gene expression of target genes silenced by methylation upon treatment with the agent.
Examples of such agents include without limitation 5-azacytidine and 5-aza-2'-deoxycytidine.

The term "detection" is used herein to refer to any process of observing a marker, in a
biological sample, whether or not the marker is actually detected. In other words, the act of
probing a sample for a marker is a "detection" even if the marker is determined to be not present
or below the level of sensitivity. Detection may be a quantitative, semi-quantitative or
non-quantitative observation.

The term "differentially methylated SLC5A8 nucleotide sequence" refers to a region of 
the SLC5A8 nucleotide sequence that is found to be methylated in a SLC5A8-associated cancer
such as a region of the SLC5A8 nucleotide sequence that is found to be methylated in cancer
tissues or cell lines, but not methylated in the normal tissues or cell lines. For example, Figure 3
delineates certain SLC5A8 regions that are differentially methylated, such as SEQ ID NOs: 11-13.

"Expression vector" refers to a replicable DNA construct used to express DNA which 
encodes the desired protein and which includes a transcriptional unit comprising an assembly of 
(1) genetic element(s) having a regulatory role in gene expression, for example, promoters, 
operators, or enhancers, operatively linked to (2) a DNA sequence encoding a desired protein (in 
this case, a SLC5A8 protein) which is transcribed into mRNA and translated into protein, and
(3) appropriate transcription and translation initiation and termination sequences. The choice of
promoter and other regulatory elements generally varies according to the intended host cell. In
general, expression vectors of utility in recombinant DNA techniques are often in the form of
"plasmids" which refer to circular double stranded DNA loops which, in their vector form are
not bound to the chromosome. In the present specification, "plasmid" and "vector" are used
interchangeably as the plasmid is the most commonly used form of vector. However, the
invention is intended to include such other forms of expression vectors which serve equivalent
functions and which become known in the art subsequently hereto.

In the expression vectors, regulatory elements controlling transcription or translation can 
be generally derived from mammalian, microbial, viral or insect genes. The ability to replicate
in a host, usually conferred by an origin of replication, and a selection gene to facilitate
recognition of transformants may additionally be incorporated. Vectors derived from viruses, 
such as retroviruses, adenoviruses, and the like, may be employed. 

As used herein, the phrase "gene expression" or "protein expression" includes any 
information pertaining to the amount of gene transcript or protein present in a sample, as well as 
information about the rate at which genes or proteins are produced or are accumulating or being 
degraded (e.g., reporter gene data, data from nuclear runoff experiments, pulse-chase data etc.).
Certain kinds of data might be viewed as relating to both gene and protein expression. For
example, protein levels in a cell are reflective of the level of protein as well as the level of
transcription, and such data is intended to be included by the phrase "gene or protein expression
information." Such information may be given in the form of amounts per cell, amounts relative
to a control gene or protein, in unitless measures, etc.; the term "information" is not to be limited
to any particular means of representation and is intended to mean any representation that
provides relevant information. The term "expression levels" refers to a quantity reflected in or
derivable from the gene or protein expression data, whether the data is directed to gene transcript
accumulation or protein accumulation or protein synthesis rates, etc.

The terms "healthy," "normal," and "non-neoplastic" are used interchangeably herein to
refer to a subject or particular cell or tissue that is devoid (at least to the limit of detection) of a
disease condition, such as a neoplasia (e.g., cancer), that is associated with SLC5A8 such as for
example neoplasia associated with silencing of SLC5A8 gene expression due to methylation.
These terms are often used herein in reference to tissues and cells of the colon. Thus, for the 
purposes of this application, a patient with severe heart disease but lacking a SLC5A8 silencing- 
associated disease would be termed "healthy."

"Homology" or "identity" or "similarity" refers to sequence similarity between two
peptides or between two nucleic acid molecules. Homology and identity can each be determined
by comparing a position in each sequence which may be aligned for purposes of comparison.
When an equivalent position in the compared sequences is occupied by the same base or amino
acid, then the molecules are identical at that position; when the equivalent site is occupied by the
same or a similar amino acid residue (e.g., similar in steric and/or electronic nature), then the
molecules can be referred to as homologous (similar) at that position. Expression as a
percentage of homology/similarity or identity refers to a function of the number of identical or
similar amino acids at positions shared by the compared sequences. A sequence which is
"unrelated" or "non-homologous" shares less than 40% identity, preferably less than 25%
identity with a sequence of the present invention. In comparing two sequences, the absence of
residues (amino acids or nucleic acids) or presence of extra residues also decreases the identity
and homology/similarity.
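
As a minimal sketch of the percentage calculation described above, the following computes percent identity for two sequences that have already been aligned. The function name, the "-" gap symbol, and the choice of alignment length as the denominator are assumptions made for illustration; the specification does not prescribe a particular formula, and published tools differ in how they treat gapped columns.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both inputs are pre-aligned strings of equal length.
    Columns where one sequence has a gap count toward the denominator
    but never as a match, so missing or extra residues lower the score.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Hypothetical aligned peptides: 4 identical columns out of 6.
print(round(percent_identity("MKT-LV", "MKTALI"), 1))
```

Under these assumptions, a pair scoring below 40 would fall under the "unrelated" threshold stated above.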

The term "homology" describes a mathematically based comparison of sequence
similarities which is used to identify genes or proteins with similar functions or motifs. The
nucleic acid and protein sequences of the present invention may be used as a "query sequence"
to perform a search against public databases to, for example, identify other family members,
related sequences or homologs. Such searches can be performed using the NBLAST and
XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST
nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to
obtain nucleotide sequences homologous to nucleic acid molecules of the invention. BLAST
protein searches can be performed with the XBLAST program, score=50, wordlength=3 to
obtain amino acid sequences homologous to protein molecules of the invention. To obtain
gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in
Altschul et al., (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and
Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST
and BLAST) can be used. See http://www.ncbi.nlm.nih.gov.

As used herein, "identity" means the percentage of identical nucleotide or amino acid
residues at corresponding positions in two or more sequences when the sequences are aligned to
maximize sequence matching, i.e., taking into account gaps and insertions. Identity can be
readily calculated by known methods, including but not limited to those described in
(Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988;
Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New
York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G.,
eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heinje,
G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds.,
M Stockton Press, New York, 1991; and Carillo, H., and Lipman, D., SIAM J. Applied Math.,
48: 1073, 1988). Methods to determine identity are designed to give the largest match between
the sequences tested. Moreover, methods to determine identity are codified in publicly available
computer programs. Computer program methods to determine identity between two sequences
include, but are not limited to, the GCG program package (Devereux, J., et al., Nucleic Acids
Research 12(1): 387 (1984)), BLASTP, BLASTN, and FASTA (Altschul, S. F. et al., J. Molec.
Biol. 215: 403-410 (1990) and Altschul et al. Nuc. Acids Res. 25: 3389-3402 (1997)). The
BLAST X program is publicly available from NCBI and other sources (BLAST Manual,
Altschul, S., et al., NCBI NLM NIH Bethesda, Md. 20894; Altschul, S., et al., J. Mol. Biol. 215:
403-410 (1990)). The well known Smith Waterman algorithm may also be used to determine
identity.
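
For readers unfamiliar with the Smith Waterman algorithm mentioned above, a minimal scoring-only sketch follows. The scoring values (match +2, mismatch -1, linear gap penalty -2) and the function name are assumptions chosen for illustration; practical implementations typically use substitution matrices and affine gap penalties, and also trace back the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Fills the dynamic-programming matrix one row at a time; each cell is
    the maximum of zero (start a new local alignment), a diagonal step
    (match or mismatch), or a step that opens a gap in either sequence.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


# Hypothetical sequences sharing the local segment "ACG".
print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

The zero floor in each cell is what makes the alignment local: poorly matching flanking regions are dropped rather than penalizing the best-matching segment.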

"SLC5A8-associated cancer" refers to cancer associated with reduced expression or no
expression of the SLC5A8 gene (previously referred to as the Huil gene), and cancer associated 
with differential methylation of SLC5A8 DNA. Examples of SLC5A8-associated cancer
include, but are not limited to, colon cancer, breast cancer, thyroid cancer, and stomach cancer.
As used herein, SLC5A8-associated cancers include both cancers and pre-cancer adenomas.

"SLC5A8-associated proliferative disorder" refers to a disease that is associated with 
either reduced expression or over-expression of the SLC5A8 gene. 

A "SLC5A8-associated protein" refers to a protein capable of interacting with and/or 
binding to a SLC5A8 polypeptide. Generally, the SLC5A8-associated protein may interact
directly or indirectly with the SLC5A8 polypeptide. 

"SLC5A8-methylation target regions" as used herein refer to those regions of SLC5A8 
that are found to be methylated. These regions include nucleotide regions that may be either 
constitutively or differentially methylated regions. For example, Figure 3 discloses a SLC5A8
region wherein certain sequences of this region are differentially methylated regions. 

"SLC5A8-nucleotide sequence" or "SLC5A8-nucleic acid sequence" as used herein 
refers to the SLC5A8 nucleotide sequences as set forth in SEQ ID NOs: 2-7 and fragments
thereof.

"SLC5A8-silencing associated diseases" as used herein includes SLC5A8-associated
cancer.

The term "including" is used herein to mean, and is used interchangeably with, the 
phrase "including but not limited to."

The term "isolated" as used in reference to nucleic acids or polypeptides indicates a 
nucleic acid or polypeptide, such as a SLC5A8 nucleic acid or polypeptide, that is isolated from, 
or otherwise substantially free of other proteins that are normally associated with the nucleic 
acid or polypeptide. 

The term "methylation-sensitive PCR" (i.e., MSP) herein refers to a polymerase chain
reaction in which amplification of the compound-converted template sequence is performed.
Two sets of primers are designed for use in MSP. Each set of primers comprises a forward
primer and a reverse primer. One set of primers, called methylation-specific primers, will
amplify the compound-converted template sequence if C bases in CpG dinucleotides within the
template DNA (e.g., a SLC5A8 nucleic acid) are methylated. Another set of primers, called
unmethylation-specific primers, will amplify the compound-converted template sequences if C
bases in CpG dinucleotides within the template DNA (e.g., a SLC5A8 nucleic acid) are not
methylated.
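
The logic of the two primer sets can be sketched as follows. This is a simplified illustration under stated assumptions, not the primer design used in the Examples: the region sequence and primers are hypothetical, converted cytosines are written as T (the base read in place of U after amplification), only forward primers on the top strand are modeled, and annealing is reduced to an exact substring match.

```python
def converted_template(seq, cpg_methylated):
    """Top-strand sequence after bisulfite conversion and amplification.

    Assumption: either all CpG cytosines are methylated or none are.
    Methylated CpG cytosines stay C; every other C is written as T.
    """
    seq = seq.upper()
    out = []
    for i, base in enumerate(seq):
        in_cpg = base == "C" and i + 1 < len(seq) and seq[i + 1] == "G"
        if base == "C" and not (in_cpg and cpg_methylated):
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def primer_matches(primer, template):
    """Crude stand-in for annealing: exact match of a forward primer."""
    return primer.upper() in template


region = "GACGTCCGATCGA"                       # hypothetical CpG-rich region
methylated = converted_template(region, True)
unmethylated = converted_template(region, False)

m_primer = "CGTTCGATCG"   # hypothetical methylation-specific primer
u_primer = "TGTTTGATTG"   # hypothetical unmethylation-specific primer

print(primer_matches(m_primer, methylated), primer_matches(m_primer, unmethylated))
print(primer_matches(u_primer, methylated), primer_matches(u_primer, unmethylated))
```

Each primer matches only the template whose methylation state it was designed for, because the two converted templates differ at every CpG cytosine.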

The term "nucleic acid" refers to polynucleotides such as deoxyribonucleic acid (DNA),
and, where appropriate, ribonucleic acid (RNA). The term should also be understood to include,
as equivalents, analogs of either RNA or DNA made from nucleotide analogs, and, as applicable
to the embodiment being described, single (sense or antisense) and double-stranded
polynucleotides.

"Operably linked" when describing the relationship between two DNA regions simply
means that they are functionally related to each other. For example, a promoter or other
transcriptional regulatory sequence is operably linked to a coding sequence if it controls the 
transcription of the coding sequence. 

The term "or" is used herein to mean, and is used interchangeably with, the term
"and/or", unless context clearly indicates otherwise. 

The terms "polypeptide" and "protein" are used interchangeably herein.

The term "recombinant" as used in reference to a nucleic acid indicates any nucleic acid
that is positioned adjacent to one or more nucleic acid sequences that it is not found adjacent to
in nature. A recombinant nucleic acid may be generated in vitro, for example by using the 
methods of molecular biology, or in vivo, for example by insertion of a nucleic acid at a novel 
chromosomal location by homologous or non-homologous recombination. The term 
"recombinant" as used in reference to a polypeptide indicates any polypeptide that is produced 
by expression and translation of a recombinant nucleic acid. 

A "sample" includes any material that is obtained or prepared for detection of a 
molecular marker or a change in a molecular marker such as the methylation state, or any 
material that is contacted with a detection reagent or detection device for the purpose of 
detecting a molecular marker or a change in the molecular marker. 

A "subject" is any organism of interest, generally a mammalian subject, such as a mouse, 
and preferably a human subject. 

The term "transgene" is used herein to describe genetic material which has been or is
about to be artificially inserted into the genome of a mammal, particularly a mammalian cell of a
living animal. By "transgenic animal" is meant a non-human animal, usually a mammal (e.g.,
mouse, rat, rabbit, hamster, etc.), having a non-endogenous nucleic acid sequence present as an
extrachromosomal element in a portion of its cells or stably integrated into its germ line DNA
(i.e., in the genomic sequence of most or all of its cells). Heterologous nucleic acid is 
introduced into the germ line of such transgenic animals by genetic manipulation of, for 
example, embryos or embryonic stem cells of the host animal. 

II. Overview

In certain aspects, the invention relates, in part, to methods for determining whether a 
patient is likely or unlikely to have a cancer, for example, colon neoplasia. A colon neoplasia is 
any cancerous or precancerous growth located in, or derived from, the colon. The colon is a
portion of the intestinal tract that is roughly three feet in length, stretching from the end of the
small intestine to the rectum. Viewed in cross section, the colon consists of four distinguishable
layers arranged in concentric rings surrounding an interior space, termed the lumen, through
which digested materials pass. In order, moving outward from the lumen, the layers are termed
the mucosa, the submucosa, the muscularis propria and the subserosa. The mucosa includes the
epithelial layer (cells adjacent to the lumen), the basement membrane, the lamina propria and the
muscularis mucosae. In general, the "wall" of the colon is intended to refer to the submucosa
and the layers outside of the submucosa. The "lining" is the mucosa. 

Precancerous colon neoplasias are referred to as adenomas or adenomatous polyps. 
Adenomas are typically small mushroom-like or wart-like growths on the lining of the colon and
do not invade into the wall of the colon. Adenomas may be visualized through a device such as
a colonoscope or flexible sigmoidoscope. Several studies have shown that patients who undergo
screening for and removal of adenomas have a decreased rate of mortality from colon cancer.
For this and other reasons, it is generally accepted that adenomas are an obligate precursor for
the vast majority of colon cancers. When a colon neoplasia invades into the basement
membrane of the colon, it is considered a colon cancer, as the term "colon cancer" is used
herein. In describing colon cancers, this specification will generally follow the so-called
"Dukes" colon cancer staging system. The characteristics that describe a cancer are
generally of greater significance than the particular term used to describe a recognizable stage.
The most widely used staging systems generally use at least one of the following characteristics
for staging: the extent of tumor penetration into the colon wall, with greater penetration
generally correlating with a more dangerous tumor; the extent of invasion of the tumor through
the colon wall and into other neighboring tissues, with greater invasion generally correlating
with a more dangerous tumor; the extent of invasion of the tumor into the regional lymph nodes,
with greater invasion generally correlating with a more dangerous tumor; and the extent of
metastatic invasion into more distant tissues, such as the liver, with greater metastatic invasion
generally correlating with a more dangerous disease state.

"Dukes A" and "Dukes B" colon cancers are neoplasias that have invaded into the wall
of the colon but have not spread into other tissues. Dukes A colon cancers are cancers that have
not invaded beyond the submucosa. Dukes B colon cancers are subdivided into two groups:
Dukes B1 and Dukes B2. "Dukes B1" colon cancers are neoplasias that have invaded up to but
not through the muscularis propria. Dukes B2 colon cancers are cancers that have breached
completely through the muscularis propria. Over a five year period, patients with Dukes A
cancer who receive surgical treatment (i.e., removal of the affected tissue) have a greater than
90% survival rate. Over the same period, patients with Dukes B1 and Dukes B2 cancer
receiving surgical treatment have a survival rate of about 85% and 75%, respectively. Dukes A,
B1 and B2 cancers are also referred to as T1, T2 and T3-T4 cancers, respectively. "Dukes C"
colon cancers are cancers that have spread to the regional lymph nodes, such as the lymph nodes
of the gut. Patients with Dukes C cancer who receive surgical treatment alone have a 35%
survival rate over a five year period, but this survival rate is increased to 60% in patients that
receive chemotherapy. "Dukes D" colon cancers are cancers that have metastasized to other
organs. The liver is the most common organ in which metastatic colon cancer is found. Patients
with Dukes D colon cancer have a survival rate of less than 5% over a five year period,
regardless of the treatment regimen. In general, colon neoplasia develops through one of at least
three different pathways, termed chromosomal instability, microsatellite instability, and the CpG
island methylator phenotype (CIMP). Although there is some overlap, these pathways tend to
present somewhat different biological behavior. By understanding the pathway of tumor
development, the target genes involved, and the mechanisms underlying the genetic instability, it
is possible to implement strategies to detect and treat the different types of colon neoplasias.

In one aspect, this application is based at least in part on the recognition that certain
target genes may be silenced or inactivated by the differential methylation of CpG islands in the
5' flanking or promoter regions of the target gene. CpG islands are clusters of cytosine-guanosine
residues in a DNA sequence that are prominently represented in the 5'-flanking region
or promoter region of about half the genes in our genome. In particular, this application is based
at least in part on the recognition that differential methylation of the SLC5A8 nucleotide
sequence may be indicative of a cancer (e.g., colon cancer, breast cancer, thyroid cancer, or
stomach cancer).
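
To make the notion of a CpG cluster concrete, the sketch below computes two statistics often used to flag candidate CpG islands: GC content and the ratio of observed to expected CpG dinucleotides. These statistics, the function name, and the example sequences are assumptions introduced for illustration; this specification does not define CpG islands by any numerical threshold.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a sequence.

    Observed/expected is (number of CG dinucleotides * length) divided by
    (number of C * number of G), a widely used heuristic for CpG
    enrichment. Returns 0.0 for the ratio when C or G is absent.
    """
    seq = seq.upper()
    length = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")
    gc_fraction = (c_count + g_count) / length if length else 0.0
    if c_count and g_count:
        obs_exp = (cpg_count * length) / (c_count * g_count)
    else:
        obs_exp = 0.0
    return gc_fraction, obs_exp


# Hypothetical sequences: one CpG-dense, one with no C or G at all.
print(cpg_stats("CGCGCGCG"))
print(cpg_stats("ATATATAT"))
```

Real genomic scans apply such statistics over sliding windows of a few hundred bases rather than over whole short sequences as in this toy example.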

As noted above, early detection of colon neoplasia, coupled with appropriate 
intervention, is important for increasing patient survival rates. Present systems for screening for
colon neoplasia are deficient for a variety of reasons, including a lack of specificity and/or
sensitivity (e.g., Fecal Occult Blood Test, flexible sigmoidoscopy) or a high cost and intensive
use of medical resources (e.g., colonoscopy). Alternative systems for detection of colon
neoplasia would be useful in a wide range of other clinical circumstances as well. For example,
patients who receive surgical and/or pharmaceutical therapy for colon cancer may experience a
relapse. It would be advantageous to have an alternative system for determining whether such
patients have a recurrent or relapsed colon neoplasia. As a further example, an alternative
diagnostic system would facilitate monitoring an increase, decrease or persistence of colon
neoplasia in a patient known to have a colon neoplasia. A patient undergoing chemotherapy
may be monitored to assess the effectiveness of the therapy.

In another aspect, the invention is also based, in part, on the discovery of a novel 
polynucleotide sequence encoding a novel sodium/solute symporter-like protein (SLC5A8). In 
particular, SLC5A8 is closely related to the human sodium iodide symporter (SLC5A5) and the
human sodium-dependent multivitamin transporter (SLC5A6).

Cell surface receptors and transmembrane transporter systems facilitate communication 
between cells and their environment by direct exchange of chemicals between the intracellular 
and extracellular milieu. Distinct transporter systems (also called permeases, porters, 
transporters, carriers, and channel proteins) are specific for ions, small and medium size solutes 
and macromolecules. A major class of transporter proteins couple solute transport to the 
movement of other species (often cations, such as protons and sodium ions) either in the same 
direction (cotransporter or symporter) or in the opposite direction (counter transporter or
antiporter). Sodium/solute symport is a widespread mechanism of solute transport across
cytoplasmic membranes of prokaryotic and eukaryotic cells. Proteins that catalyze
sodium/solute symport have been grouped into eleven families based on their degree of
sequence similarities, their solute and cation specificities, size, topographical features, and
evolutionary relationships (see, e.g., Reizer et al., (1994) Biochimica et Biophysica Acta,
1197:133-166). There are mixed families of transporters whose members differ in the choice of
cation or catalyze symport or antiport processes.

Human sodium iodide transporter (NIS, or SLC5A5) is the best characterized member
of the sodium/solute symporter superfamily. NIS localizes at the basolateral membrane and
catalyses the active transport of iodide from blood into the cells using the inwardly directed
sodium gradient with a 2 sodium 1 iodide stoichiometry. The tissue distribution of NIS includes
the thyroid, salivary glands, stomach, thymus, and breast. Lower levels of expression of NIS are
detected in the prostate, ovary, adrenal gland, lung, and heart. By contrast, the NIS gene has not
been detected in the colon, orbital fibroblasts, or nasopharyngeal mucosa (see, e.g., Filetti et al.,
1999, Eur J Endocrinol. 141:443-457). Abnormal NIS expression and/or iodide transport
activity have been linked to many thyroid diseases including autoimmune thyroid diseases,
thyroid nodular hyperplasia, thyroid adenoma, thyroid carcinoma, and congenital
hypothyroidism, as well as non-thyroid diseases such as breast cancer and stomach cancer
(Chung, 2002, J Nucl Med 43:1188-200).

Besides sequence homology to the human sodium iodide transporter, SLC5A8 transcript 
was found by Applicants to be expressed in the normal colon mucosa, kidney, lung, esophagus, 
small bowel, stomach, thyroid, and uterus. In addition, Applicants found that SLC5A8 may
function as a sodium iodide transporter, and that differential methylation of SLC5A8 and/or
reduced expression of SLC5A8 are linked to diseases such as colon cancer, breast cancer, and
stomach cancer. Accordingly, the present invention relates to methods and compositions for
detecting and treating such SLC5A8-associated cancers.

III. SLC5A8 polypeptides

In certain aspects, the invention provides a full-length SLC5A8 polypeptide (SEQ ID 
NO: 1) and functional variants thereof. Preferred functional variants of SLC5A8 polypeptides
are those that have tumor suppressor activity or sodium transporter activity. In certain aspects, 
the present invention includes biologically-active fragments of the SLC5A8 protein and fusion 
proteins including at least a portion of the SLC5A8 protein. These include proteins with 
SLC5A8 activity that have amino acid substitutions or have sugars or other molecules attached 
to amino acid functional groups. 

In certain embodiments, the present disclosure makes available isolated and/or purified 
forms of the SLC5A8 polypeptides, which are isolated from, or otherwise substantially free of,
other proteins which might normally be associated with the protein or a particular complex
including the protein. In certain embodiments, variant polypeptides have an amino acid
sequence that is at least 75% identical to an amino acid sequence as set forth in SEQ ID NO: 1.
In other embodiments, the variant polypeptide has an amino acid sequence at least 80%, 85%,
90%, 95%, 97%, 98%, 99% or 100% identical to an amino acid sequence as set forth in SEQ ID
NO: 1.

In certain aspects, variant SLC5A8 polypeptides are agonists or antagonists of the 
SLC5A8 polypeptide as set forth in SEQ ID NO: 1. Variants of these polypeptides may have a 
hyperactive or constitutive activity, or act to prevent the tumor suppressor activity or sodium
transporter activity of SLC5A8. For example, a truncated form lacking one or more domains
may have a dominant negative effect.

In certain aspects, isolated peptidyl portions of the SLC5A8 polypeptide can be obtained
by screening polypeptides recombinantly produced from the corresponding fragment of the
nucleic acid encoding the polypeptide as set forth in SEQ ID NO: 1. In addition, fragments can
be chemically synthesized using techniques known in the art such as conventional Merrifield
solid phase f-Moc or t-Boc chemistry. The fragments can be produced (recombinantly or by
chemical synthesis) and tested to identify those peptidyl fragments which can function as either
agonists or antagonists of the SLC5A8 activity (e.g., tumor suppressor or sodium solute
symporter).

The SLC5A8 protein is a transmembrane protein, with portions of the protein that are 
positioned outside the cell (the extracellular portions) and portions of the protein that are 
positioned inside the cell (the intracellular portions). Sequences and positions of the predicted 
thirteen transmembrane domains (TM1-TM13) are listed below. 

TM1 (residues 10-32): FWWDYWFAGMLVISAAIGIYY (SEQ ID NO: 19) 

TM2 (residues 52-74): MTAVPVALSLTASFMSAVTVLGT (SEQ ID NO: 20) 

TM3 (residues 84-106): IFSIFAFTYFFWVISAEVFLPV (SEQ ID NO: 21) 

TM4 (residues 127-149): VRLCGTVLFIVQTILYTGIVIYA (SEQ ID NO: 22) 

TM5 (residues 164-186): GAWATGWCTFYCTLGGLKAVI (SEQ ID NO: 23) 

TM6 (residues 193-215): IGMVAGFASVnQAWMQGGIS (SEQ ID NO: 24) 

TM7 (residues 240-259): HTFAVTIEGGTFTWTSIYGV (SEQ ID NO: 25) 



TM8 (residues 280-302): LYINLVGLWAILTCSVFCGLALY (SEQ ID NO: 26) 

TM9 (residues 337-359): LPGLFVACAYSGTLSTVSSSINA (SEQ ID NO: 27) 

TM10 (residues 380-402): SLSWISQGMSWYGALCIGMAAL (SEQ ID NO: 28) 

TM11 (residues 412-434): AALSVFGMVGGPLMGLFALGILV (SEQ ID NO: 29) 

TM12 (residues 441-463): GALVGLMACff AISLWVGIGAQIY (SEQ ID NO: 30) 

TM13 (residues 519-541): LSYLYFSTVGTLVTLLVGILVSL (SEQ ID NO: 31) 

Thus, certain embodiments of the invention include SLC5A8 fragments comprising a 
transmembrane domain as set forth in any of SEQ ID NOs: 19-21. In other embodiments, the 
present invention includes SLC5A8 fragments comprising an intracellular domain or an 
extracellular portion of the SLC5A8 protein. 

In certain aspects, variant SLC5A8 polypeptides contain one or more fusion domains. 
Well known examples of such fusion domains include polyhistidine, Glu-Glu, 
glutathione S transferase (GST), thioredoxin, protein A, protein G, an immunoglobulin 
heavy chain constant region (Fc), and maltose binding protein (MBP), which are particularly 
useful for isolation of the fusion polypeptide by affinity chromatography. For the purpose of 
affinity purification, relevant matrices for affinity chromatography, such as glutathione-, 
amylose-, and nickel- or cobalt-conjugated resins are used. Many of such matrices are available 
in "kit" form, such as the Pharmacia GST purification system and the QIAexpress™ system 
(Qiagen) useful with (HIS6) fusion partners. Another fusion domain well known in the art is 
green fluorescent protein (GFP). This fusion partner serves as a fluorescent "tag" which allows 
the fusion polypeptide of the invention to be identified by fluorescence microscopy or by flow 
cytometry. The GFP tag is useful when assessing subcellular localization of the fusion SLC5A8 
polypeptide. The GFP tag is also useful for isolating cells which express the fusion SLC5A8 
polypeptide by flow cytometric methods such as fluorescence activated cell sorting (FACS). 
Fusion domains also include "epitope tags," which are usually short peptide sequences for which 
a specific antibody is available. Well known epitope tags for which specific monoclonal 
antibodies are readily available include FLAG, influenza virus haemagglutinin (HA), and c-myc 
tags. In some cases, the fusion domains have a protease cleavage site, such as for Factor Xa or 
Thrombin, which allows the relevant protease to partially digest the fusion SLC5A8 polypeptide 
and thereby liberate the recombinant polypeptide therefrom. The liberated polypeptide can then 
be isolated from the fusion partner by subsequent chromatographic separation. 

Different elements of fusion proteins may be arranged in any manner that is consistent 
with the desired functionality. For example, a SLC5A8 polypeptide may be placed C-terminal 
to a heterologous domain, or, alternatively, a heterologous domain may be placed C-terminal to 
a SLC5A8 polypeptide. The SLC5A8 and the heterologous domain need not be adjacent in a 
fusion protein, and additional domains or amino acid sequences may be included C- or 
N-terminal to either domain or between the domains. 

It is also possible to modify the structure of the subject SLC5A8 polypeptides for such 
purposes as enhancing therapeutic or prophylactic efficacy, or stability (e.g., ex vivo shelf life 
and resistance to proteolytic degradation in vivo). Such modified polypeptides, when designed 
to retain at least one activity of the naturally occurring form of the protein, are considered 
functional equivalents of the SLC5A8 polypeptides described in more detail herein. Such 
modified polypeptides can be produced, for instance, by amino acid substitution, deletion or 
addition. 

For instance, it is reasonable to expect that an isolated replacement of a 
leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a 
similar replacement of an amino acid with a structurally related amino acid (i.e., conservative 
mutations) will not have a major effect on the biological activity of the resulting molecule. 
Conservative replacements are those that take place within a family of amino acids that are 
related in their side chains (see, for example, Biochemistry, 2nd ed., Ed. by L. Stryer, W.H. 
Freeman and Co., 1981). Whether a change in the amino acid sequence of a polypeptide results 
in a functional homolog can be readily determined by assessing the ability of the variant 
polypeptide to produce a response in cells in a fashion similar to the wild-type protein. For 
instance, such variant forms of a SLC5A8 polypeptide can be assessed, e.g., for their ability to 
transport sodium solute or their ability to suppress tumor formation. Polypeptides in which 
more than one replacement has taken place can readily be tested in the same manner. 
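
The notion of a conservative replacement within a side-chain family can be sketched in Python as follows. The four-family grouping used here is an assumption (one commonly used grouping, not one recited in this application), and the function names and the four-residue peptides in the usage note are hypothetical. Such a classification only flags candidates; functional equivalence would still be assessed by the assays described above.

```python
# Illustrative sketch only. The family grouping below is an assumed, commonly
# used side-chain classification; other groupings are equally possible.
FAMILIES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "nonpolar": set("AVLIPFMW"),
    "uncharged_polar": set("GNQCSTY"),
}


def family_of(residue):
    """Return the name of the side-chain family containing a one-letter residue code."""
    for name, members in FAMILIES.items():
        if residue in members:
            return name
    raise ValueError("unknown residue: %r" % residue)


def is_conservative(original, replacement):
    """True if the replacement differs from the original but stays in the same family."""
    return original != replacement and family_of(original) == family_of(replacement)


def classify_substitutions(wild_type, variant):
    """List (1-based position, wild-type residue, variant residue, conservative?) for each difference."""
    if len(wild_type) != len(variant):
        raise ValueError("sequences must have equal length")
    return [(i + 1, w, v, is_conservative(w, v))
            for i, (w, v) in enumerate(zip(wild_type, variant)) if w != v]
```

For example, `classify_substitutions("MLDT", "MIET")` reports a leucine-to-isoleucine change at position 2 and an aspartate-to-glutamate change at position 3, both flagged as conservative under the assumed grouping.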

This invention further contemplates a method of generating sets of combinatorial 
mutants of the SLC5A8 polypeptides, as well as truncation mutants, and is especially useful for 
identifying potential variant sequences (e.g., homologs) that are functional in binding to a 
SLC5A8 polypeptide. The purpose of screening such combinatorial libraries may be to 
generate, for example, SLC5A8 homologs which can act as either agonists or antagonists, or 
alternatively, which possess novel activities altogether. Combinatorially-derived homologs can 
be generated which have a selective potency relative to a naturally occurring SLC5A8 
polypeptide. Such proteins, when expressed from recombinant DNA constructs, can be used in 
gene therapy protocols. Likewise, mutagenesis can give rise to variants which have intracellular 
half-lives dramatically different than the corresponding wild-type protein. For example, the 
altered protein can be rendered either more stable or less stable to proteolytic degradation or 
other cellular processes which result in destruction of, or otherwise inactivation of, the SLC5A8 
polypeptide of interest. Such variants, and the genes which encode them, can be utilized to alter 
SLC5A8 levels by modulating the half-life of the protein. For instance, a short half-life can give 
rise to more transient biological effects and, when part of an inducible expression system, can 
allow tighter control of recombinant SLC5A8 levels within the cell. As above, such proteins, 
and particularly their recombinant nucleic acid constructs, can be used in gene therapy protocols. 
In similar fashion, SLC5A8 homologs can be generated by the present combinatorial approach 
to act as antagonists, in that they are able to interfere with the ability of the corresponding 
wild-type protein to function. 

In a representative embodiment of this method, the amino acid sequences for a 
population of SLC5A8 homologs are aligned, preferably to promote the highest homology 
possible. Such a population of variants can include, for example, homologs from one or more 
species, or homologs from the same species but which differ due to mutation. Amino acids 
which appear at each position of the aligned sequences may be selected to create a degenerate 
set of combinatorial sequences. In a preferred embodiment, the combinatorial library is 
produced by way of a degenerate library of genes encoding a library of polypeptides which each 
include at least a portion of potential SLC5A8 sequences. For instance, a mixture of synthetic 
oligonucleotides can be enzymatically ligated into gene sequences such that the degenerate set 
of potential SLC5A8 nucleotide sequences are expressible as individual polypeptides, or 
alternatively, as a set of larger fusion proteins (e.g., for phage display). 
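
The selection of residues at each aligned position can be sketched in Python as follows. This is a minimal illustration assuming gap-free, equal-length aligned sequences; the function names and the short homolog sequences in the usage note are hypothetical and are not SLC5A8 sequences.

```python
# Illustrative sketch only: build a degenerate set of combinatorial sequences
# from residues observed at each position of aligned homologs.
from itertools import product


def residues_per_position(aligned_homologs):
    """Return, for each aligned position, the sorted list of residues observed there."""
    if not aligned_homologs:
        raise ValueError("at least one sequence is required")
    length = len(aligned_homologs[0])
    if any(len(seq) != length for seq in aligned_homologs):
        raise ValueError("aligned sequences must have equal length")
    return [sorted({seq[i] for seq in aligned_homologs}) for i in range(length)]


def combinatorial_library(aligned_homologs):
    """Enumerate every sequence obtainable by choosing one observed residue per position."""
    choices = residues_per_position(aligned_homologs)
    return ["".join(combo) for combo in product(*choices)]
```

For example, the three hypothetical homologs `["MAVL", "MSVL", "MAIL"]` vary at two positions with two residues each, so `combinatorial_library` returns four sequences, one of which ("MSIL") is not present among the input homologs. The library size is the product of the number of choices at each position, which grows quickly for longer or more divergent alignments.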

There are many ways by which the library of potential homologs can be generated from a 
degenerate oligonucleotide sequence. Chemical synthesis of a degenerate gene sequence can be 
carried out in an automatic DNA synthesizer, and the synthetic genes can then be ligated into an 
appropriate gene for expression. The purpose of a degenerate set of genes is to provide, in one 
mixture, all of the sequences encoding the desired set of potential SLC5A8 sequences. The 
synthesis of degenerate oligonucleotides is well known in the art (see for example, Narang, SA 
(1983) Tetrahedron 39:3; Itakura et al., (1981) Recombinant DNA, Proc. 3rd Cleveland Sympos. 
Macromolecules, ed. AG Walton, Amsterdam: Elsevier pp273-289; Itakura et al., (1984) Annu. 
Rev. Biochem. 53:323; Itakura et al., (1984) Science 198:1056; Ike et al., (1983) Nucleic Acid 
Res. 11:477). Such techniques have been employed in the directed evolution of other proteins 
(see, for example, Scott et al., (1990) Science 249:386-390; Roberts et al., (1992) PNAS USA 
89:2429-2433; Devlin et al., (1990) Science 249:404-406; Cwirla et al., (1990) PNAS USA 87: 
6378-6382; as well as U.S. Patent Nos: 5,223,409, 5,198,346, and 5,096,815). 
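
To illustrate how a single degenerate oligonucleotide provides, in one mixture, a defined set of sequences, the following Python sketch expands a degenerate DNA sequence written with the standard IUPAC ambiguity codes. The function names are hypothetical, and the example oligonucleotides are illustrative only.

```python
# Illustrative sketch only: expand a degenerate oligonucleotide written in
# IUPAC nucleotide ambiguity codes into the individual sequences it represents.
from itertools import product
from math import prod

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo):
    """Number of distinct sequences encoded by a degenerate oligonucleotide."""
    return prod(len(IUPAC[base]) for base in oligo.upper())


def expand(oligo):
    """List every unambiguous sequence encoded by a degenerate oligonucleotide."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in oligo.upper()))]
```

For example, `expand("CAY")` yields the two histidine codons CAC and CAT, while the degenerate codon NNK has a degeneracy of 32. Checking `degeneracy` before calling `expand` is advisable, since the number of sequences multiplies with every ambiguous position.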

Alternatively, other forms of mutagenesis can be utilized to generate a combinatorial 
library. For example, SLC5A8 variants (both agonist and antagonist forms) can be generated 
and isolated from a library by screening using, for example, alanine scanning mutagenesis and 
the like (Ruf et al., (1994) Biochemistry 33:1565-1572; Wang et al., (1994) J. Biol. Chem. 
269:3095-3099; Balint et al., (1993) Gene 137:109-118; Grodberg et al., (1993) Eur. J. 
Biochem. 218:597-601; Nagashima et al., (1993) J. Biol. Chem. 268:2888-2892; Lowman et al., 
(1991) Biochemistry 30:10832-10838; and Cunningham et al., (1989) Science 244:1081-1085), 
by linker scanning mutagenesis (Gustin et al., (1993) Virology 193:653-660; Brown et al., 
(1992) Mol. Cell Biol. 12:2644-2652; McKnight et al., (1982) Science 232:316); by saturation 
mutagenesis (Meyers et al., (1986) Science 232:613); by PCR mutagenesis (Leung et al., (1989) 
Method Cell Mol Biol 1:11-19); or by random mutagenesis, including chemical mutagenesis, 
etc. (Miller et al., (1992) A Short Course in Bacterial Genetics, CSHL Press, Cold Spring 
Harbor, NY; and Greener et al., (1994) Strategies in Mol Biol 7:32-34). Linker scanning 
mutagenesis, particularly in a combinatorial setting, is an attractive method for identifying 
truncated (bioactive) forms of SLC5A8 polypeptides. 
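
By way of illustration, the set of single-residue variants used in an alanine scan can be enumerated in silico before the corresponding constructs are made. The Python sketch below is a minimal example; the function name, the 1-based numbering and the "X10A"-style labels are assumptions made for illustration, and the three-residue peptide in the usage note is hypothetical.

```python
# Illustrative sketch only: enumerate alanine-scanning variants of a peptide.
def alanine_scan(sequence, first_position=1):
    """Return (label, variant) pairs, replacing each non-alanine residue with alanine in turn.

    Labels follow the pattern <original residue><position>A, with positions
    numbered from first_position.
    """
    variants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue  # already alanine; nothing to scan at this position
        label = "%s%dA" % (residue, first_position + index)
        variants.append((label, sequence[:index] + "A" + sequence[index + 1:]))
    return variants
```

For example, `alanine_scan("MAK")` returns the variants M1A ("AAK") and K3A ("MAA"), skipping the residue that is already alanine. The `first_position` argument allows a fragment to be labeled with its numbering in the full-length protein.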

A wide range of techniques are known in the art for screening gene products of 
combinatorial libraries made by point mutations and truncations, and, for that matter, for 
screening cDNA libraries for gene products having a certain property. Such techniques will be 
generally adaptable for rapid screening of the gene libraries generated by the combinatorial 
mutagenesis of SLC5A8 variants. The most widely used techniques for screening large gene 
libraries typically comprise cloning the gene library into replicable expression vectors, 
transforming appropriate cells with the resulting library of vectors, and expressing the 
combinatorial genes under conditions in which detection of a desired activity facilitates 
relatively easy isolation of the vector encoding the gene whose product was detected. Each of 
the illustrative assays described below is amenable to high-throughput analysis as necessary to 
screen large numbers of degenerate sequences created by combinatorial mutagenesis techniques. 

In an illustrative embodiment of a screening assay, candidate combinatorial gene 
products of one of the subject proteins are displayed on the surface of a cell or virus, and the 
ability of particular cells or viral particles to bind a SLC5A8 polypeptide is detected in a 
"panning assay." For instance, a library of SLC5A8 variants can be cloned into the gene for a 
surface membrane protein of a bacterial cell (Ladner et al., WO 88/06630; Fuchs et al., (1991) 
Bio/Technology 9:1370-1371; and Goward et al., (1992) TIBS 18:136-140), and the resulting 
fusion protein detected by panning, e.g., using a fluorescently labeled molecule which binds the 
SLC5A8 polypeptide, to score for potentially functional homologs. Cells can be visually 
inspected and separated under a fluorescence microscope, or, where the morphology of the cell 
permits, separated by a fluorescence-activated cell sorter. 

In similar fashion, the gene library can be expressed as a fusion protein on the surface of 
a viral particle. For instance, in the filamentous phage system, foreign peptide sequences can be 
expressed on the surface of infectious phage, thereby conferring two significant benefits. First, 
since these phage can be applied to affinity matrices at very high concentrations, a large number 
of phage can be screened at one time. Second, since each infectious phage displays the 
combinatorial gene product on its surface, if a particular phage is recovered from an affinity 
matrix in low yield, the phage can be amplified by another round of infection. The group of 
almost identical E. coli filamentous phages M13, fd, and f1 are most often used in phage display 
libraries, as either of the phage gIII or gVIII coat proteins can be used to generate fusion proteins 
without disrupting the ultimate packaging of the viral particle (Ladner et al., PCT publication 
WO 90/02909; Garrard et al., PCT publication WO 92/09690; Marks et al., (1992) J. Biol. 
Chem. 267:16007-16010; Griffiths et al., (1993) EMBO J. 12:725-734; Clackson et al., (1991) 
Nature 352:624-628; and Barbas et al., (1992) PNAS USA 89:4457-4461). 

In certain embodiments, the invention also provides for reduction of the subject SLC5A8 
polypeptides to generate mimetics, e.g., peptide or non-peptide agents, which are able to mimic 
binding of the authentic protein to another cellular partner. Such mutagenic techniques as 
described above, as well as the thioredoxin system, are also particularly useful for mapping the 
determinants of a SLC5A8 polypeptide which participate in protein-protein interactions 
involved in, for example, binding of proteins involved in angiogenesis to each other. To 
illustrate, the critical residues of a SLC5A8 polypeptide which are involved in molecular 
recognition of a substrate protein can be determined and used to generate SLC5A8 
polypeptide-derived peptidomimetics which bind to the substrate protein, and by inhibiting 
SLC5A8 binding, inhibit its biological activity. By employing, for example, scanning 
mutagenesis to map the amino acid residues of a SLC5A8 polypeptide which are involved in 
binding to another polypeptide, peptidomimetic compounds can be generated which mimic those 
residues involved in binding. For instance, non-hydrolyzable peptide analogs of such residues 
can be generated using benzodiazepine (e.g., see Freidinger et al., in Peptides: Chemistry and 
Biology, G.R. Marshall ed., ESCOM Publisher: Leiden, Netherlands, 1988), azepine (e.g., see 
Huffman et al., in Peptides: Chemistry and Biology, G.R. Marshall ed., ESCOM Publisher: 
Leiden, Netherlands, 1988), substituted gamma lactam rings (Garvey et al., in Peptides: 
Chemistry and Biology, G.R. Marshall ed., ESCOM Publisher: Leiden, Netherlands, 1988), 
keto-methylene pseudopeptides (Ewenson et al., (1986) J. Med. Chem. 29:295; and Ewenson et 
al., in Peptides: Structure and Function (Proceedings of the 9th American Peptide Symposium) 
Pierce Chemical Co. Rockland, IL, 1985), β-turn dipeptide cores (Nagai et al., (1985) 
Tetrahedron Lett 26:647; and Sato et al., (1986) J Chem Soc Perkin Trans 1:1231), and 
β-aminoalcohols (Gordon et al., (1985) Biochem Biophys Res Commun 126:419; and Dann et 
al., (1986) Biochem Biophys Res Commun 134:71). 

In certain embodiments, the SLC5A8 polypeptides may further comprise post- 
translational or non-amino acid elements, such as hydrophobic modifications (e.g., polyethylene 
glycols or lipids), poly- or mono-saccharide modifications, phosphates, acetylations, etc. Effects 
of such elements on the functionality of a SLC5A8 polypeptide may be tested as described 
herein for other SLC5A8 variants. 

In certain aspects, the present invention contemplates direct delivery of SLC5A8 
polypeptides into a cell. Methods of directly introducing a polypeptide into a cell include, but 
are not limited to, protein transduction and protein therapy. For example, a protein transduction 
domain (PTD) can be fused to a nucleic acid encoding a SLC5A8 protein, and the fusion protein 
is expressed and purified. Fusion proteins containing the PTD are permeable to the cell 
membrane, and thus cells can be directly contacted with a fusion protein (Derossi et al. (1994) 
Journal of Biological Chemistry 269: 10444-10450; Han et al. (2000) Molecules and Cells 6: 
728-732; Hall et al. (1996) Current Biology 6: 580-587; Theodore et al. (1995) Journal of 
Neuroscience 15: 7158-7167). 

Although some protein transduction based methods rely on fusion of a polypeptide of 
interest to a sequence which mediates introduction of the protein into a cell, other protein 
transduction methods do not require covalent linkage of a protein of interest to a transduction 
domain. At least two commercially available reagents exist that mediate protein transduction 
without covalent modification of the protein (Chariot™, produced by Active Motif, 
www.activemotif.com and Bioporter® Protein Delivery Reagent, produced by Gene Therapy 
Systems, www.genetherapysystems.com). Briefly, these protein transduction reagents can be 
used to deliver proteins, peptides and antibodies directly to cells including mammalian cells. 
Delivery of proteins directly to cells has a number of advantages. Firstly, many current 
techniques of gene delivery are based on delivery of a nucleic acid sequence which must be 
transcribed and/or translated by a cell before expression of the protein is achieved. This results 
in a time lag between delivery of the nucleic acid and expression of the protein. Direct delivery 
of a protein decreases this delay. Secondly, delivery of a protein often results in transient 
expression of the protein in a cell. 

As outlined herein, protein transduction mediated by covalent attachment of a PTD to a 
protein can be used to deliver a protein to a cell. These methods require that individual proteins 
be covalently appended with PTD moieties. In contrast, methods such as Chariot™ and 
Bioporter® facilitate transduction by forming a noncovalent interaction between the reagent and 
the protein. Without being bound by theory, these reagents are thought to facilitate transit of the 
cell membrane, and following internalization into a cell the reagent and protein complex 
dissociates so that the protein is free to function in the cell. 

IV. SLC5A8 nucleic acids 

In certain aspects, the invention provides isolated and/or recombinant SLC5A8 nucleic 
acids encoding SLC5A8 polypeptides, for example, SEQ ID NOs: 3 and 4. The SLC5A8 
polynucleotides may be single-stranded or double-stranded. Such nucleic acids may be DNA or 
RNA molecules. The SLC5A8 nucleic acids are useful as diagnostic or therapeutic agents. For 
example, these nucleic acid molecules encode the SLC5A8 protein, and are useful in 
assaying for the presence of SLC5A8 transcripts in cancer cells (e.g., colon cancer cells, breast 
cancer cells, thyroid cancer cells, or stomach cancer cells). 

SLC5A8 nucleic acids of the invention are further understood to include nucleic acids 
that comprise variants of SEQ ID NOs: 3 and 4. Variant nucleotide sequences include 
sequences that differ by one or more nucleotide substitutions, additions or deletions, such as 
allelic variants; and will, therefore, include coding sequences that differ from the nucleotide 
sequence of the coding sequence designated in SEQ ID NOs: 3 and 4. Optionally, a SLC5A8 
nucleic acid of the invention will genetically complement a partial or complete SLC5A8 loss of 
function phenotype. For example, a SLC5A8 nucleic acid of the invention may be expressed in 
a cell in which the endogenous SLC5A8 gene has been deleted, and the introduced SLC5A8 
nucleic acid will mitigate a phenotype resulting from the gene deletion. 

The present invention is based, at least in part, on the observation that SLC5A8 
nucleotide sequences can be differentially methylated in certain SLC5A8-associated cancers, 
such as colon cancer, breast cancer, thyroid cancer or stomach cancer. Accordingly, certain 
aspects of the present invention provide SLC5A8 nucleic acids having certain regions that are 
differentially methylated in SLC5A8-associated cancer, for example, SEQ ID NOs: 12, 13, and 
14, and fragments thereof. Detection of methylation in any one of such differentially methylated 
nucleic acid sequences would be indicative of a SLC5A8-associated cancer. 

In certain embodiments, the application provides isolated or recombinant SLC5A8 
nucleic acid sequences that are at least 80%, 85%, 90%, 95%, 97%, 98%, 99% or 100% identical 
to the SLC5A8 nucleic acid sequences (e.g., SEQ ID NOs: 3-4 and 12-14). One of ordinary skill 
in the art will appreciate that SLC5A8 nucleic acid sequences complementary to SEQ ID NOs: 
3-4 and 12-14, and variants of SEQ ID NOs: 3-4 and 12-14 are also within the scope of this 
invention. In further embodiments, the SLC5A8 nucleic acid sequences of the invention can be 
isolated, recombinant, and/or fused with a heterologous nucleotide sequence, or in a DNA 
library. 

In other embodiments, SLC5A8 nucleic acid sequences also include nucleotide 
sequences that hybridize under highly stringent conditions to the nucleotide sequences 
designated in SEQ ID NOs: 3-4 and 12-14, or fragments thereof. As discussed above, one of 
ordinary skill in the art will understand readily that appropriate stringency conditions which 
promote DNA hybridization can be varied. 
For example, one could perform the hybridization at 6.0 x sodium chloride/sodium citrate (SSC) 
at about 45 °C, followed by a wash of 2.0 x SSC at 50 °C. For example, the salt concentration 
in the wash step can be selected from a low stringency of about 2.0 x SSC at 50 °C to a high 
stringency of about 0.2 x SSC at 50 °C. In addition, the temperature in the wash step can be 
increased from low stringency conditions at room temperature, about 22 °C, to high stringency 
conditions at about 65 °C. Both temperature and salt may be varied, or temperature or salt 
concentration may be held constant while the other variable is changed. In one embodiment, the 
invention provides nucleic acids which hybridize under low stringency conditions of 6 x SSC at 
room temperature followed by a wash at 2 x SSC at room temperature. 
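
The interplay of salt concentration and temperature in the wash step can be illustrated with a rough calculation. The Python sketch below uses one commonly quoted empirical approximation for the melting temperature of a long DNA duplex, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 675/N; this formula is not part of this application, other versions with different constants exist, and it is only a rough guide, particularly at high salt. The sketch assumes that 1 x SSC (0.15 M sodium chloride, 0.015 M trisodium citrate) contributes about 0.195 M sodium ion. All function names and the example duplex (200 base pairs, 50% GC) are hypothetical.

```python
# Illustrative sketch only: compare wash stringency using an assumed
# empirical Tm approximation (not taken from this application).
import math

NA_MOLAR_PER_1X_SSC = 0.195  # 0.15 M NaCl + 3 x 0.015 M trisodium citrate


def sodium_molarity(ssc_x):
    """Approximate sodium ion concentration (mol/L) of an SSC solution of the given strength."""
    return ssc_x * NA_MOLAR_PER_1X_SSC


def estimate_tm(gc_percent, length, ssc_x):
    """Rough duplex melting temperature (deg C) at the given SSC strength."""
    return (81.5 + 16.6 * math.log10(sodium_molarity(ssc_x))
            + 0.41 * gc_percent - 675.0 / length)


def stringency_margin(wash_temp_c, gc_percent, length, ssc_x):
    """Degrees below the estimated Tm at which the wash is performed (smaller = more stringent)."""
    return estimate_tm(gc_percent, length, ssc_x) - wash_temp_c
```

Under this approximation, lowering the wash from 2.0 x SSC to 0.2 x SSC reduces the estimated Tm by 16.6 °C, and raising the wash temperature from 22 °C to 65 °C further narrows the margin below Tm, consistent with the low-to-high stringency ranges described above.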

Isolated SLC5A8 nucleic acids which differ from the nucleic acids (e.g., SEQ ID NOs: 
3-4 and 12-14) due to degeneracy in the genetic code are also within the scope of the invention. 
For example, a number of amino acids are designated by more than one triplet. Codons that 
specify the same amino acid, or synonyms (for example, CAU and CAC are synonyms for 
histidine) may result in "silent" mutations which do not affect the amino acid sequence of the 
protein. However, it is expected that DNA sequence polymorphisms that do lead to changes in 
the amino acid sequences of the subject proteins will exist among mammalian cells. One skilled 
in the art will appreciate that these variations in one or more nucleotides (up to about 3-5% of 
the nucleotides) of the nucleic acids encoding a particular protein may exist among individuals 
of a given species due to natural allelic variation. Any and all such nucleotide variations and 
resulting amino acid polymorphisms are within the scope of this invention. 
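
The degeneracy of the genetic code can be illustrated with a short Python sketch that builds the standard codon table and tests whether two coding sequences differ only by synonymous ("silent") changes. The function names are hypothetical; RNA input is handled by treating U as T.

```python
# Illustrative sketch only: synonymous codons under the standard genetic code.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Codons are generated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...), matching AMINO_ACIDS.
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def normalize(sequence):
    """Upper-case a nucleotide sequence and write it in the DNA alphabet (U -> T)."""
    return sequence.upper().replace("U", "T")


def translate(sequence):
    """Translate complete codons in reading frame 1 ('*' marks a stop codon)."""
    seq = normalize(sequence)
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def is_silent(sequence_a, sequence_b):
    """True if the sequences differ in nucleotides but encode the same amino acids."""
    return (normalize(sequence_a) != normalize(sequence_b)
            and translate(sequence_a) == translate(sequence_b))


def synonyms(amino_acid):
    """Sorted list of DNA codons that specify the given one-letter amino acid."""
    return sorted(codon for codon, aa in CODON_TABLE.items() if aa == amino_acid)
```

For example, `is_silent("CAU", "CAC")` is true because both codons specify histidine, whereas changing CAU to CAA alters the encoded residue to glutamine and is therefore not silent.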

In certain embodiments, the recombinant SLC5A8 nucleic acid may be operably linked 
to one or more regulatory nucleotide sequences in an expression construct. Regulatory 
nucleotide sequences will generally be appropriate for a host cell used for expression. 
Numerous types of appropriate expression vectors and suitable regulatory sequences are known 
in the art for a variety of host cells. Typically, said one or more regulatory nucleotide sequences 
may include, but are not limited to, promoter sequences, leader or signal sequences, ribosomal 
binding sites, transcriptional start and termination sequences, translational start and termination 
sequences, and enhancer or activator sequences. Constitutive or inducible promoters as known 
in the art are contemplated by the invention. The promoters may be either naturally occurring 
promoters, or hybrid promoters that combine elements of more than one promoter. An 
expression construct may be present in a cell on an episome, such as a plasmid, or the expression 
construct may be inserted in a chromosome. In a preferred embodiment, the expression vector 
contains a selectable marker gene to allow the selection of transformed host cells. Selectable 
marker genes are well known in the art and will vary with the host cell used. 

In certain aspects, the application provides methylated forms of SLC5A8 nucleic acid 
sequences of SEQ ID NOs: 12-14 or fragments thereof, wherein the cytosine bases of the CpG 
islands present in said sequences are methylated. In other words, the SLC5A8 nucleic acid 
sequences of the present invention may be either in the methylated status (e.g., as seen in 
SLC5A8-associated cancer tissues) or in the unmethylated status (e.g., as seen in normal 
tissues). 

In certain embodiments, the present invention provides bisulfite-converted SLC5A8 
template DNA sequences, for example, SEQ ID NOs: 15-18, and fragments thereof. Such 
bisulfite-converted SLC5A8 template DNA can be used for detecting the methylation status, for 
example, by an MSP reaction or by direct sequencing. These bisulfite-converted SLC5A8 
sequences are also of use for designing primers for MS-PCR reactions that specifically detect 
methylated or unmethylated SLC5A8 templates following bisulfite conversion. In yet other 
embodiments, the bisulfite-converted SLC5A8 nucleotide sequences of the invention also 
include nucleotide sequences that will hybridize under highly stringent conditions to any 
nucleotide sequence selected from SEQ ID NOs: 15-18. In further aspects, the application 
provides methods for producing such bisulfite-converted nucleic acid sequences. For example, 
the application provides methods for treating a nucleotide sequence with a bisulfite agent such 
that the unmethylated cytosine bases are converted to a different nucleotide base such as a uracil. 
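
The effect of bisulfite treatment on a template can be modeled in silico as follows. This Python sketch is a simplified illustration: it converts every unmethylated cytosine to uracil and leaves methylated cytosines unchanged, and it treats only a single strand. The function names, the 0-based position convention and the eight-base example sequence are hypothetical and are not taken from SEQ ID NOs: 12-18.

```python
# Illustrative sketch only: in silico bisulfite conversion of one DNA strand.
def cpg_cytosines(sequence):
    """Return the 0-based positions of cytosines that are followed by guanine (CpG sites)."""
    seq = sequence.upper()
    return {i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"}


def bisulfite_convert(sequence, methylated_positions=frozenset()):
    """Convert unmethylated C to U; cytosines at methylated_positions (0-based) stay C."""
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)
```

For example, with the hypothetical template "ACGTCCGA", conversion of the fully CpG-methylated form (`bisulfite_convert(seq, cpg_cytosines(seq))`) gives "ACGTUCGA", whereas the unmethylated form gives "AUGTUUGA". The two products differ only at CpG cytosines, which is the difference that methylation-specific primers are designed to distinguish.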

The present invention also provides primers which can be used in PCR to obtain the 
SLC5A8 nucleic acids from cDNA. The present invention also encompasses oligonucleotides 
that are useful as hybridization probes for detecting transcripts of the genes which encode the 
SLC5A8 protein. Preferably, such oligonucleotides comprise at least 200 nucleotides. Such 
hybridization probes have a sequence which is at least 90% complementary with a contiguous 
sequence contained within the sense strand or antisense strand of a double stranded DNA 
molecule which encodes the SLC5A8 protein. Such hybridization probes bind to the sense 
strand or antisense under stringent conditions, preferably under highly stringent conditions. The 
probes are used in Northern assays to detect transcripts of SLC5A8 homologous genes and in 
Southern assays to detect SLC5A8 homologous genes. The identity of probes which are 200 
nucleotides in length and have full complementarity with a portion of the sense or antisense 
strand of a double-stranded DNA molecule which encodes the SLC5A8 protein as set forth in 
SEQ ID NO: 1. 
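
The complementarity criterion for a hybridization probe can be checked computationally. The Python sketch below slides the reverse complement of a probe along a target strand and reports the best ungapped percent complementarity; the function names and the short example sequences are hypothetical, and an ungapped comparison is an assumption made to keep the example simple.

```python
# Illustrative sketch only: best ungapped percent complementarity between a
# probe and any contiguous stretch of a target DNA strand.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def best_percent_complementarity(probe, strand):
    """Highest percentage of probe bases that pair with a contiguous window of the strand."""
    target = reverse_complement(probe)
    strand = strand.upper()
    width = len(target)
    if width == 0 or width > len(strand):
        raise ValueError("probe must be non-empty and no longer than the strand")
    best = max(sum(a == b for a, b in zip(target, strand[i:i + width]))
               for i in range(len(strand) - width + 1))
    return 100.0 * best / width


def meets_complementarity(probe, strand, minimum_percent=90.0):
    """True if the probe is at least minimum_percent complementary to some window of the strand."""
    return best_percent_complementarity(probe, strand) >= minimum_percent
```

For example, the hypothetical probe "CCAAGC" is fully complementary to a window of the strand "AAGCTTGGCC", whereas the single-mismatch probe "CCAAGT" reaches only five of six positions and falls below a 90% threshold at this length.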

The various Sequence Identification Numbers that have been used in this application are 
summarized below in Table 1. 

Table 1. Sequence Identification Numbers that have been used in this application. 

SEQ ID NO   Description/Name                                          Corresponding Figure 

1           amino acid sequence of human SLC5A8 protein.              Figure 18. 

2           genomic clone AC063951. Nucleotides 82200-83267           Figure 1. 
            encompass the promoter and/or exon 1 of the SLC5A8 
            gene, and are referred to as the "SLC5A8 methylation 
            target region." 

3           nucleotide sequence of the SLC5A8 mRNA transcript.        Figure 2. 

4           nucleotide sequence of the SLC5A8 coding region.          Figure 23B. 

5           3D41-Hpa2-190R                                            N/A. 

6           3D41-Hpa2-633F                                            N/A. 

7           3D41-Hpa2-82430F                                          N/A. 

8           AS-unmeth-442s                                            N/A. 

9           AS-unmeth-542as                                           N/A. 

10          AS-meth-442-459s                                          N/A. 

11          AS-meth-550as                                             N/A. 

12          nucleotides 82200-83267 of AC063951, wild-type,           Figure 4. 
            sense strand. 

13          nucleotides 82200-83267 of AC063951, wild-type,           Figure 8. 
            antisense strand. 

14          nucleotides 300-600 of SEQ ID NO: 12, wild-type,          Figure 9. 
            antisense strand. 

15          nucleotides 82200-83267 of AC063951, antisense            Figure 10. 
            strand, bisulfite-converted/methylated. 

16          nucleotides 82200-83267 of AC063951, antisense            Figure 11. 
            strand, bisulfite-converted/unmethylated. 

17          nucleotides 82200-83267 of AC063951, sense strand,        Figure 12. 
            bisulfite-converted/methylated. 

18          nucleotides 82200-83267 of AC063951, sense strand,        Figure 13. 
            bisulfite-converted/unmethylated. 

V. SLC5A8 Expression Vectors 

In certain aspects, nucleic acids encoding SLC5A8 polypeptides and variants thereof 
may be used to increase SLC5A8 expression in an organism or cell by direct delivery of the 
nucleic acid. A nucleic acid therapy construct of the present invention can be delivered, for 
example, as an expression plasmid which, when transcribed in the cell, produces RNA which 
encodes a SLC5A8 polypeptide. 

In another aspect of the invention, the subject nucleic acid is provided in an expression 
vector comprising a nucleotide sequence encoding a subject SLC5A8 polypeptide and operably 
linked to at least one regulatory sequence. Regulatory sequences are art-recognized and are 
selected to direct expression of the SLC5A8 polypeptide. Accordingly, the term regulatory 
sequence includes promoters, enhancers, and other expression control elements. Exemplary 
regulatory sequences are described in Goeddel; Gene Expression Technology: Methods in 
Enzymology, Academic Press, San Diego, CA (1990). For instance, any of a wide variety of 
expression control sequences that control the expression of a DNA sequence when operatively 
linked to it may be used in these vectors to express DNA sequences encoding a SLC5A8 
polypeptide. Such useful expression control sequences include, for example, the early and late 
promoters of SV40, tet promoter, adenovirus or cytomegalovirus immediate early promoter, the 
lac system, the trp system, the TAC or TRC system, T7 promoter whose expression is directed 
by T7 RNA polymerase, the major operator and promoter regions of phage lambda, the control 
regions for fd coat protein, the promoter for 3-phosphoglycerate kinase or other glycolytic 
enzymes, the promoters of acid phosphatase, e.g., Pho5, the promoters of the yeast α-mating 
factors, the polyhedron promoter of the baculovirus system and other sequences known to 
control the expression of genes of prokaryotic or eukaryotic cells or their viruses, and various 
combinations thereof. It should be understood that the design of the expression vector may 
depend on such factors as the choice of the host cell to be transformed and/or the type of protein 
desired to be expressed. Moreover, the vector's copy number, the ability to control that copy 
number and the expression of any other protein encoded by the vector, such as antibiotic 
markers, should also be considered. 

As will be apparent, the subject gene constructs can be used to cause expression of the 
subject SLC5A8 polypeptides in cells propagated in culture, e.g., to produce proteins or 
polypeptides, including fusion proteins or polypeptides, for purification. 

This invention also pertains to a host cell transfected with a recombinant gene including 
a coding sequence for one or more of the subject SLC5A8 polypeptides. The host cell may be 
any prokaryotic or eukaryotic cell. For example, a polypeptide of the present invention may be 
expressed in bacterial cells such as E. coli, insect cells (e.g., using a baculovirus expression 
system), yeast, or mammalian cells. Other suitable host cells are known to those skilled in the 
art. 

Accordingly, the present invention further pertains to methods of producing the subject 
SLC5A8 polypeptides. For example, a host cell transfected with an expression vector encoding 
a SLC5A8 polypeptide can be cultured under appropriate conditions to allow expression of the 
polypeptide to occur. The polypeptide may be secreted and isolated from a mixture of cells and 
medium containing the polypeptide. Alternatively, the polypeptide may be retained 
cytoplasmically or in a membrane fraction and the cells harvested, lysed and the protein isolated. 
A cell culture includes host cells, media and other byproducts. Suitable media for cell culture 
are well known in the art. The polypeptide can be isolated from cell culture medium, host cells, 
or both using techniques known in the art for purifying proteins, including ion-exchange 
chromatography, gel filtration chromatography, ultrafiltration, electrophoresis, and 
immunoaffinity purification with antibodies specific for particular epitopes of the polypeptide. 
In a preferred embodiment, the SLC5A8 polypeptide is a fusion protein containing a domain 
which facilitates its purification, such as a SLC5A8-GST fusion protein, SLC5A8-intein fusion 
protein, SLC5A8-cellulose binding domain fusion protein, SLC5A8-polyhistidine fusion 
protein, etc. 

A recombinant SLC5A8 nucleic acid can be produced by ligating the cloned gene, or a 
portion thereof, into a vector suitable for expression in either prokaryotic cells, eukaryotic cells 
(yeast, avian, insect or mammalian), or both. Expression vehicles for production of 
recombinant SLC5A8 polypeptides include plasmids and other vectors. For instance, suitable 
vectors for the expression of a SLC5A8 polypeptide include plasmids of the types: pBR322- 
derived plasmids, pEMBL-derived plasmids, pEX-derived plasmids, pBTac-derived plasmids 
and pUC-derived plasmids for expression in prokaryotic cells, such as E. coli. 

The preferred mammalian expression vectors contain both prokaryotic sequences to 
facilitate the propagation of the vector in bacteria, and one or more eukaryotic transcription units 
that are expressed in eukaryotic cells. The pcDNAI/amp, pcDNAI/neo, pRc/CMV, pSV2gpt, 
pSV2neo, pSV2-dhfr, pTk2, pRSVneo, pMSG, pSVT7, pko-neo and pHyg derived vectors are 
examples of mammalian expression vectors suitable for transfection of eukaryotic cells. Some 
of these vectors are modified with sequences from bacterial plasmids, such as pBR322, to 
facilitate replication and drug resistance selection in both prokaryotic and eukaryotic cells. 
Alternatively, derivatives of viruses such as the bovine papilloma virus (BPV-1), or Epstein- 
Barr virus (pHEBo, pREP-derived and p205) can be used for transient expression of proteins in 
eukaryotic cells. Examples of other viral (including retroviral) expression systems can be found 
below in the description of gene therapy delivery systems. The various methods employed in 
the preparation of the plasmids and transformation of host organisms are well known in the art. 
For other suitable expression systems for both prokaryotic and eukaryotic cells, as well as 
general recombinant procedures, see Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by 
Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press, 1989) Chapters 16 and 
17. In some instances, it may be desirable to express the recombinant SLC5A8 polypeptide by 
the use of a baculovirus expression system. Examples of such baculovirus expression systems 
include pVL-derived vectors (such as pVL1392, pVL1393 and pVL941), pAcUW-derived 
vectors (such as pAcUW1), and pBlueBac-derived vectors (such as the β-gal containing 
pBlueBac III). 

In another embodiment, a fusion gene coding for a purification leader sequence, such as 
a poly-(His)/enterokinase cleavage site sequence at the N-terminus of the desired portion of the 
recombinant SLC5A8 protein, can allow purification of the expressed fusion protein by affinity 
chromatography using a Ni2+ metal resin. The purification leader sequence can then be 
subsequently removed by treatment with enterokinase to provide the purified SLC5A8 
polypeptide (e.g., see Hochuli et al., (1987) J. Chromatography 411:177; and Janknecht et al., 
PNAS USA 88:8972). 
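
The logic of a removable purification leader can be illustrated at the sequence level. The sketch below assumes the commonly used enterokinase recognition sequence Asp-Asp-Asp-Asp-Lys (DDDDK), with cleavage occurring after the lysine, and a six-histidine tag. The function names and the short target peptide used in the test are hypothetical placeholders, not SLC5A8 sequence.

```python
# Sequence-level sketch of a poly-(His)/enterokinase leader (hypothetical names).
ENTEROKINASE_SITE = "DDDDK"  # assumed recognition sequence; cleavage follows the K


def build_fusion(target: str, his_len: int = 6) -> str:
    """Prepend Met + poly-His + enterokinase site to the target polypeptide."""
    return "M" + "H" * his_len + ENTEROKINASE_SITE + target


def cleave_leader(fusion: str) -> str:
    """Return the polypeptide left after cleavage at the first enterokinase site."""
    idx = fusion.find(ENTEROKINASE_SITE)
    if idx == -1:
        raise ValueError("no enterokinase site found in fusion protein")
    return fusion[idx + len(ENTEROKINASE_SITE):]
```

Because the cleavage point lies immediately after the recognition sequence, placing the leader at the N-terminus leaves no leader residues on the released polypeptide in this model.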

Techniques for making fusion genes are well known. Essentially, the joining of various 
DNA fragments coding for different polypeptide sequences is performed in accordance with 
conventional techniques, employing blunt-ended or stagger-ended termini for ligation, 
restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as 
appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. 
In another embodiment, the fusion gene can be synthesized by conventional techniques 
including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments 
can be carried out using anchor primers which give rise to complementary overhangs between 
two consecutive gene fragments which can subsequently be annealed to generate a chimeric 
gene sequence (see, for example, Current Protocols in Molecular Biology, eds. Ausubel et al., 
John Wiley & Sons: 1992). 

VI. Antibodies 

Another aspect of the invention pertains to an antibody reactive with a SLC5A8 
polypeptide, preferably antibodies that are specifically reactive with SLC5A8 polypeptide. For 
example, by using immunogens derived from a SLC5A8 polypeptide, anti-protein/anti-peptide 
antisera or monoclonal antibodies can be made by standard protocols (see, for example, 
Antibodies: A Laboratory Manual ed. by Harlow and Lane (Cold Spring Harbor Press: 1988)). 
A mammal, such as a mouse, a hamster or rabbit can be immunized with an immunogenic form 
of the peptide (e.g., a SLC5A8 polypeptide or an antigenic fragment which is capable of 
eliciting an antibody response, or a fusion protein). Techniques for conferring immunogenicity 
on a protein or peptide include conjugation to carriers or other techniques well known in the art. 
An immunogenic portion of a SLC5A8 polypeptide can be administered in the presence of 
adjuvant. The progress of immunization can be monitored by detection of antibody titers in 
plasma or serum. Standard ELISA or other immunoassays can be used with the immunogen as 
antigen to assess the levels of antibodies. In a preferred embodiment, the subject antibodies are 
immunospecific for antigenic determinants of a SLC5A8 polypeptide as set forth in SEQ ID 
NO: 1. 

In one embodiment, antibodies are specific for the SLC5A8 protein as encoded by 
nucleic acid sequences as set forth in SEQ ID NOs: 3 and 4. In other embodiments, an antibody 
is immunoreactive with one or more proteins having an amino acid sequence that is at least 85%, 
90%, 95%, 98%, 99%, 99.3%, 99.5%, 99.7% or 100% identical to an amino acid sequence as set 
forth in SEQ ID NO: 1. 
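
Percent identity thresholds of this kind can be checked computationally once two sequences have been aligned. The sketch below is illustrative only: it assumes the two sequences are already aligned to equal length with "-" marking gaps, and it counts every aligned column (including gapped columns) in the denominator, which is one of several conventions in use. The function names are hypothetical.

```python
# Percent identity over a precomputed pairwise alignment (hypothetical names).
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity between two pre-aligned sequences.

    Assumption: columns where both sequences have a gap are ignored;
    a column with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With this convention, two ten-residue sequences differing at one position are 90% identical and so satisfy the 85% and 90% thresholds but not the 95% threshold.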

In another embodiment, antibodies of the invention are specific for the extracellular 
portion of the SLC5A8 protein. In a set of exemplary embodiments, an antibody binds to an 
extracellular portion of SEQ ID NO: 1. In another embodiment, antibodies of the invention are 
specific for the intracellular portion or the transmembrane portion of the SLC5A8 protein. In a 
further embodiment, antibodies of the invention are specific for the soluble SLC5A8 protein and 
variants thereof. 

Following immunization of an animal with an antigenic preparation of a SLC5A8 
polypeptide, anti-SLC5A8 antisera can be obtained and, if desired, polyclonal anti-SLC5A8 
antibodies can be isolated from the serum. To produce monoclonal antibodies, antibody-producing 
cells (lymphocytes) can be harvested from an immunized animal and fused by 
standard somatic cell fusion procedures with immortalizing cells such as myeloma cells to yield 
hybridoma cells. Such techniques are well known in the art, and include, for example, the 
hybridoma technique (originally developed by Kohler and Milstein, (1975) Nature, 256: 495-497), 
the human B cell hybridoma technique (Kozbor et al., (1983) Immunology Today, 4: 72), 
and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., (1985) 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc. pp. 77-96). Hybridoma cells 
can be screened immunochemically for production of antibodies specifically reactive with a 
SLC5A8 polypeptide of the present invention and monoclonal antibodies isolated from a culture 
comprising such hybridoma cells. In one embodiment, anti-SLC5A8 antibodies specifically 
react with the protein encoded by a nucleic acid having the sequence of SEQ ID NO: 3 or 4. 

The term "antibody" as used herein is intended to include fragments thereof which are 
also specifically reactive with a subject SLC5A8 polypeptide. Antibodies can be fragmented 
using conventional techniques and the fragments screened for utility in the same manner as 
described above for whole antibodies. For example, F(ab)2 fragments can be generated by 
treating antibody with pepsin. The resulting F(ab)2 fragment can be treated to reduce disulfide 
bridges to produce Fab fragments. The antibody of the present invention is further intended to 
include bispecific, single-chain, and chimeric and humanized molecules having affinity for a 
SLC5A8 polypeptide conferred by at least one CDR region of the antibody. In preferred 
embodiments, the antibody further comprises a label attached thereto and able to be detected 
(e.g., the label can be a radioisotope, fluorescent compound, enzyme or enzyme co-factor). 

In certain preferred embodiments, an antibody of the invention is a monoclonal antibody, 
and in certain embodiments, the invention makes available methods for generating novel 
antibodies. For example, a method for generating a monoclonal antibody that binds specifically 
to a SLC5A8 polypeptide may comprise administering to a mouse an amount of an 
immunogenic composition comprising the SLC5A8 polypeptide effective to stimulate a 
detectable immune response, obtaining antibody-producing cells (e.g., cells from the spleen) 
from the mouse and fusing the antibody-producing cells with myeloma cells to obtain antibody-producing 
hybridomas, and testing the antibody-producing hybridomas to identify a hybridoma 
that produces a monoclonal antibody that binds specifically to the SLC5A8 polypeptide. Once 
obtained, a hybridoma can be propagated in a cell culture, optionally in culture conditions where 
the hybridoma-derived cells produce the monoclonal antibody that binds specifically to the 
SLC5A8 polypeptide. The monoclonal antibody may be purified from the cell culture. 

Anti-SLC5A8 antibodies can be used, e.g., to detect SLC5A8 polypeptides in biological 
samples and/or to monitor SLC5A8 polypeptide levels in an individual. The level of SLC5A8 
polypeptide may be measured in a variety of sample types such as, for example, in cells, stools, 
and/or in bodily fluid, such as in whole blood samples, blood serum, blood plasma and urine. 
The phrase "specifically reactive with" as used in reference to an antibody is intended to 
mean, as is generally understood in the art, that the antibody is sufficiently selective between the 
antigen of interest (e.g., a SLC5A8 polypeptide) and other antigens that are not of interest that 
the antibody is useful for, at minimum, detecting the presence of the antigen of interest in a 
particular type of biological sample. In certain methods employing the antibody, a higher degree 
of specificity in binding may be desirable. For example, an antibody for use in detecting a low 
abundance protein of interest in the presence of one or more very high abundance proteins that 
are not of interest may perform better if it has a higher degree of selectivity between the antigen 
of interest and other cross-reactants. Monoclonal antibodies generally have a greater tendency 
(as compared to polyclonal antibodies) to discriminate effectively between the desired antigens 
and cross-reacting polypeptides. In addition, an antibody that is effective at selectively 
identifying an antigen of interest in one type of biological sample (e.g., a stool sample) may not 
be as effective for selectively identifying the same antigen in a different type of biological 
sample (e.g., a blood sample). Likewise, an antibody that is effective at identifying an antigen 
of interest in a purified protein preparation that is devoid of other biological contaminants may 
not be as effective at identifying an antigen of interest in a crude biological sample, such as a 
blood or urine sample. Accordingly, in preferred embodiments, the application provides 
antibodies that have demonstrated specificity for a SLC5A8 protein in a sample type that is 
likely to be the sample type of choice for use of the antibody. In a particularly preferred 
embodiment, the application provides antibodies that bind specifically to a SLC5A8 polypeptide 
in a protein preparation from blood (optionally serum or plasma) from a patient that has a 
SLC5A8-associated cancer or that bind specifically in a crude blood sample (optionally a crude 
serum or plasma sample). 

One characteristic that influences the specificity of an antibody:antigen interaction is the 
affinity of the antibody for the antigen. Although the desired specificity may be reached with a 
range of different affinities, generally preferred antibodies will have an affinity (a dissociation 
constant) of about 10^-6, 10^-7, 10^-8, 10^-9 or less. 
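
The practical meaning of a dissociation constant can be illustrated with the standard single-site equilibrium binding relationship, in which the fraction of antigen bound equals [Ab] / (Kd + [Ab]) for a free antibody concentration [Ab]. The sketch below is illustrative only: it assumes simple 1:1 binding at equilibrium, with both quantities expressed in the same concentration units, and the function name is hypothetical.

```python
# Fraction of antigen bound at equilibrium for simple 1:1 binding
# (hypothetical helper; concentrations and Kd must share the same units).
def fraction_bound(free_antibody: float, kd: float) -> float:
    """Return the fraction of antigen occupied by antibody."""
    if free_antibody < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_antibody / (kd + free_antibody)
```

Under this model, half of the antigen is bound when the free antibody concentration equals the dissociation constant, so a lower dissociation constant means that less antibody is needed to reach the same occupancy.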

In addition, the techniques used to screen antibodies in order to identify a desirable 
antibody may influence the properties of the antibody obtained. For example, an antibody to be 
used for certain therapeutic purposes will preferably be able to target a particular cell type. 
Accordingly, to obtain antibodies of this type, it may be desirable to screen for antibodies that 
bind to cells that express the antigen of interest (e.g., by fluorescence activated cell sorting). 
Likewise, if an antibody is to be used for binding an antigen in solution, it may be desirable to 
test solution binding. A variety of different techniques are available for testing interaction 
between antibodies and antigens to identify particularly desirable antibodies. Such techniques 
include ELISAs, surface plasmon resonance binding assays (e.g., the Biacore binding assay, 
Bia-core AB, Uppsala, Sweden), sandwich assays (e.g., the paramagnetic bead system of IGEN 
International, Inc., Gaithersburg, Maryland), western blots, immunoprecipitation assays, and 
immunohistochemistry. 

In certain embodiments, antibodies of the invention may be useful as diagnostic or 
therapeutic agents for detecting or treating SLC5A8-associated diseases (e.g., cancers). The 
diagnostic method comprises the steps of contacting a sample of test cells or a protein extract 
thereof with immunospecific anti-SLC5A8 antibodies and assaying for the formation of a 
complex between the antibodies and a protein in the sample. Formation of low levels of 
complex in the test cell as compared to the normal cells indicates that the test cell is cancerous. 

VII. Transgenic Animals 

Another aspect of the invention features transgenic non-human animals which express a 
heterologous SLC5A8 gene, e.g., having a sequence of SEQ ID NO: 3 or 4, or fragments 
thereof. In another aspect, the invention features transgenic non-human animals which have had 
one or both copies of the endogenous SLC5A8 genes disrupted in at least one of the tissue or 
cell-types of the animal. In one embodiment, the transgenic non-human animal is a mammal 
such as a mouse, rat, rabbit, goat, sheep, dog, cat, cow or non-human primate. Without being 
bound to theory, it is proposed that such an animal may display a phenomenon associated with 
reduced or increased chance of cancer development (e.g., colon cancer, breast cancer, thyroid 
cancer, or stomach cancer). Accordingly, such a transgenic animal may serve as a useful animal 
model to study the progression of cancer diseases. 

The term "transgene" is used herein to describe genetic material that has been or is about 
to be artificially inserted into the genome of a mammalian cell, particularly a mammalian cell of 
a living animal. The transgene is used to transform a cell, meaning that a permanent or transient 
genetic change, preferably a permanent genetic change, is induced in a cell following 
incorporation of exogenous DNA. A permanent genetic change is generally achieved by 
introduction of the DNA into the genome of the cell. Vectors for stable integration include 
plasmids, retroviruses and other animal viruses, YACs, and the like. Of interest are transgenic 
mammals, e.g., cows, pigs, goats, horses, etc., and particularly rodents, e.g., rats, mice, etc. 
Preferably, the transgenic animals are mice. 

Transgenic animals comprise an exogenous nucleic acid sequence present as an 
extrachromosomal element or stably integrated in all or a portion of its cells, especially in germ 
cells. Unless otherwise indicated, it will be assumed that a transgenic animal comprises stable 
changes to the germline sequence. During the initial construction of the animal, "chimeras" or 
"chimeric animals" are generated, in which only a subset of cells have the altered genome. 
Chimeras are primarily used for breeding purposes in order to generate the desired transgenic 
animal. Animals having a heterozygous alteration are generated by breeding of chimeras. Male 
and female heterozygotes are typically bred to generate homozygous animals. 
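
The expected outcome of the heterozygote cross can be enumerated directly. The sketch below is illustrative only and assumes simple single-locus Mendelian inheritance with germline transmission of the altered allele. The allele symbols ("T" for the altered allele, "w" for wild type) and the function name are hypothetical.

```python
# Expected genotype frequencies for a single-locus cross (hypothetical names).
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_1: str, parent_2: str) -> dict:
    """Return genotype frequencies among offspring of two diploid parents.

    Each parent is a two-character genotype such as "Tw"; each offspring
    receives one allele from each parent with equal probability.
    """
    counts = Counter(
        "".join(sorted(a + b)) for a, b in product(parent_1, parent_2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}
```

Under these assumptions, crossing two heterozygotes ("Tw" x "Tw") is expected to give one quarter homozygous altered ("TT"), one half heterozygous and one quarter wild-type offspring, whereas crossing a heterozygote with a wild-type animal gives no homozygous altered offspring.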

The exogenous gene is usually either from a different species than the animal host, or is 
otherwise altered in its coding or non-coding sequence. The introduced gene may be a wild-type 
gene, naturally occurring polymorphism, or a genetically manipulated sequence, for example 
having deletions, substitutions or insertions in the coding or non-coding regions. Where the 
introduced gene is a coding sequence, it is usually operably linked to a promoter, which may be 
constitutive or inducible, and other regulatory sequences required for expression in the host 
animal. 

In one aspect of the invention, a SLC5A8 transgene can encode the wild-type form of the 
protein, homologs thereof, as well as antisense constructs. A SLC5A8 transgene can also 
encode a soluble form of SLC5A8 that has tumor suppressor activity or sodium solute 
transporter activity. 

It may be desirable to express the heterologous SLC5A8 transgene conditionally such 
that either the timing or the level of SLC5A8 gene expression can be regulated. Such 
conditional expression can be provided using prokaryotic promoter sequences which require 
prokaryotic proteins to be simultaneously expressed in order to facilitate expression of the 
SLC5A8 transgene. Exemplary promoters and the corresponding trans-activating prokaryotic 
proteins are given in U.S. Patent No. 4,833,080. 

Moreover, transgenic animals exhibiting tissue specific expression can be generated, for 
example, by inserting a tissue specific regulatory element, such as an enhancer, into the 
transgene. For example, the endogenous SLC5A8 gene promoter or a portion thereof can be 
replaced with another promoter and/or enhancer, e.g., a CMV or a Moloney murine leukemia 
virus (MLV) promoter and/or enhancer. 

Transgenic animals containing an inducible SLC5A8 transgene can be generated using 
inducible regulatory elements (e.g., metallothionein promoter), which are well-known in the art. 
SLC5A8 transgene expression can then be initiated in these animals by administering to the 
animal a compound which induces gene expression (e.g., heavy metals). Another preferred 
inducible system comprises a tetracycline-inducible transcriptional activator (U.S. Patent Nos. 
5,654,168 and 5,650,298). 

The present invention provides transgenic animals that carry the transgene in all their 
cells, as well as animals that carry the transgene in some, but not all cells, i.e., mosaic animals. 
The transgene can be integrated as a single transgene or in tandem, e.g., head to head tandems, 
or head to tail or tail to tail or as multiple copies. 

The successful expression of the transgene can be detected by any of several means well 
known to those skilled in the art. Non-limiting examples include Northern blot, in situ 
hybridization of mRNA analysis, Western blot analysis, immunohistochemistry, and FACS 
analysis of protein expression. 

In a further aspect, the invention features non-human animal cells containing a SLC5A8 
transgene, preferentially a human SLC5A8 transgene. For example, the animal cell (e.g., 
somatic cell or germ cell (i.e., egg or sperm)) can be obtained from the transgenic animal. 
Transgenic somatic cells or cell lines can be used, for example, in drug screening assays. 
Transgenic germ cells, on the other hand, can be used in generating transgenic progeny. 

Although not necessary to the operability of the invention, the transgenic animals 
described herein may comprise alterations to endogenous genes in addition to, or alternatively, 
to the genetic alterations described above. For example, the host animals may be either 
"knockouts" or "knockins" for the SLC5A8 gene. Knockouts have a partial or complete loss of 
function in one or both alleles of an endogenous gene of interest. Knockins have an introduced 
transgene with altered genetic sequence and/or function from the endogenous gene. The two 
may be combined, for example, such that the naturally occurring gene is disabled, and an altered 
form introduced. For example, it may be desirable to knockout the host animal's endogenous 
SLC5A8 gene, while introducing an exogenous SLC5A8 gene (e.g., a human SLC5A8 gene). 

In a knockout, preferably the target gene expression is undetectable or insignificant. For 
example, a knock-out of a SLC5A8 gene means that function of the SLC5A8 has been 
substantially decreased so that expression is not detectable or only present at insignificant levels. 
This may be achieved by a variety of mechanisms, including introduction of a disruption of the 
coding sequence, e.g., insertion of one or more stop codons, insertion of a DNA fragment, 
deletion of coding sequence, substitution of stop codons for coding sequence, etc. In some 
cases, the exogenous transgene sequences are ultimately deleted from the genome, leaving a net 
change to the native sequence. Different approaches may be used to achieve the "knock-out." A 
chromosomal deletion of all or part of the native gene may be induced, including deletions of the 
non-coding regions, particularly the promoter region, 3' regulatory sequences, enhancers, or 
deletions of gene that activate expression of APP genes. A functional knock-out may also be 
achieved by the introduction of an anti-sense construct that blocks expression of the native genes 
(for example, see Li and Cohen (1996) Cell 85:319-329). "Knock-outs" also include conditional 
knock-outs, for example, where alteration of the target gene occurs upon exposure of the animal 
to a substance that promotes target gene alteration, introduction of an enzyme that promotes 
recombination at the target gene site (e.g., Cre in the Cre-lox system), or other method for 
directing the target gene alteration postnatally. 

A "knockin" of a target gene means an alteration in a host cell genome that results in 
altered expression or function of a native target gene. Increased (including ectopic) or decreased 
expression may be achieved by introduction of an additional copy of the target gene, or by 
operatively inserting a regulatory sequence that provides for enhanced expression of an 
endogenous copy of the target gene. These changes may be constitutive or conditional, i.e., 
dependent on the presence of an activator or repressor. The use of knockin technology may be 
combined with production of exogenous sequences to produce the transgenic animals of the 
invention. 

DNA constructs for random integration need not include regions of homology to mediate 
recombination. Where homologous recombination is desired, the DNA constructs will comprise 
at least a portion of the target gene with the desired genetic modification, and will include 
regions of homology to the target locus. Conveniently, markers for positive and negative 
selection are included. Methods for generating cells having targeted gene modifications through 
homologous recombination are known in the art. For various techniques for transfecting 
mammalian cells, see Keown et al. (1990) Methods in Enzymology 185:527-537. 

For embryonic stem (ES) cells, an ES cell line may be employed, or embryonic cells may 
be obtained freshly from a host, e.g., mouse, rat, or guinea pig. Such cells are grown on an 
appropriate fibroblast-feeder layer or grown in the presence of appropriate growth factors, such 
as leukemia inhibiting factor (LIF). When ES cells have been transformed, they may be used to 
produce transgenic animals. After transformation, the cells are plated onto a feeder layer in an 
appropriate medium. Cells containing the construct may be detected by employing a selective 
medium. After sufficient time for colonies to grow, they are picked and analyzed for the 
occurrence of homologous recombination or integration of the construct. Those colonies that are 
positive may then be used for embryo manipulation and blastocyst injection. Blastocysts are 
obtained from 4 to 6 week old superovulated females. The ES cells are trypsinized, and the 
modified cells are injected into the blastocoel of the blastocyst. After injection, the blastocysts 
are returned to each uterine horn of pseudopregnant females. Females are then allowed to go to 
term and the resulting litters screened for mutant cells having the construct. By providing for a 
different phenotype of the blastocyst and the ES cells, chimeric progeny can be readily detected. 

The chimeric animals are screened for the presence of the modified gene and males and 
females having the modification are mated to produce homozygous progeny. If the gene 
alterations cause lethality at some point in development, tissues or organs can be maintained as 
allogeneic or congenic grafts or transplants, or in in vitro culture. 

The transgenic animals of the present invention may be an animal model for a SLC5A8- 
associated disease (e.g., cancer), and display cancer-related phenotypes (e.g., colon cancer, 
breast cancer, thyroid cancer, or stomach cancer), depending on different alleles generated. 
Accordingly, such transgenic animals can be used in in vivo assays to identify cancer 
therapeutics. In an exemplary embodiment, the assay comprises administering a test compound 
to a transgenic animal of the invention, and comparing a phenotypic change in cancer 
development in the animal relative to a transgenic animal which has not received the test 
compound. 

To illustrate, the transgenic animals and cell lines are particularly useful in screening 
compounds that have potential as prophylactic or therapeutic treatments of diseases such as may 
involve aberrant expression, or loss, of the SLC5A8 gene. Screening for a useful drug would 
involve administering the candidate drug over a range of doses to the transgenic animal, and 
assaying at various time points for the effect(s) of the drug on the disease or disorder being 
evaluated. Alternatively, or additionally, the drug could be administered prior to or 
simultaneously with exposure to induction of the disease, if applicable. 

In one embodiment, candidate compounds are screened by being administered to the 
transgenic animal, over a range of doses, and evaluating the animal's physiological response to 
the compound(s) over time. Administration may be oral, or by suitable injection, depending on 
the chemical nature of the compound being evaluated. In some cases, it may be appropriate to 
administer the compound in conjunction with co-factors that would enhance the efficacy of the 
compound. 

In screening cell lines derived from the subject transgenic animals for compounds useful 
in treating various disorders, the test compound is added to the cell culture medium at the 
appropriate time, and the cellular response to the compound is evaluated over time using the 
appropriate biochemical and/or histological assays. In some cases, it may be appropriate to 
apply the compound of interest to the culture medium in conjunction with co-factors that would 
enhance the efficacy of the compound. 

In another aspect, the animals of this invention can be used as a source of cells, 
differentiated or precursor, which can be immortalized in cell culture. Cells in which the normal 
function of the SLC5A8 protein is altered by a transgene may be isolated from potentially any 
tissue of the animal, as well as from animals at any developmental stage, e.g. embryonic to 
adult. The subject transgenic animals can, accordingly, be used as a source of material for the 
growth, identification, purification and detailed analysis of, inter alia, precursor cells, including 
stem cells and pluripotent progenitor cells for a variety of tissues. 

Vectors used for transforming animal embryos are constructed using methods well 
known in the art, including, without limitation, the standard techniques of restriction 
endonuclease digestion, ligation, plasmid and DNA and RNA purification, DNA sequencing, 
and the like as described, for example in Sambrook, Fritsch, and Maniatis, eds., Molecular 
Cloning: A Laboratory Manual, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 
N.Y., 1989). Most practitioners are familiar with the standard resource materials as well as 
specific conditions and procedures. 

VIII. Screening Assays 

The invention provides methods (also referred to herein as "screening assays") for 
identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, 
peptidomimetics, peptoids, small molecules or other drugs) which bind to SLC5A8 proteins, 
have a stimulatory or inhibitory effect on, for example, SLC5A8 expression or SLC5A8 activity, 
or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 
SLC5A8 substrate. Compounds thus identified can be used to modulate the activity of target 
gene products (e.g., the SLC5A8 gene) in a therapeutic protocol, to elaborate the biological 
function of the target gene product, or to identify compounds that disrupt normal target gene 
interactions. Given that the SLC5A8 polypeptide is a transmembrane protein, agents that bind 
to a SLC5A8 polypeptide may include its natural ligands, downstream signaling molecules, and 
other endogenous polypeptides as well as artificial compounds. In one embodiment, an assay 
detects agents which inhibit interaction of the subject SLC5A8 polypeptides with a 
SLC5A8-associated protein. A wide variety of assays may be used for this purpose, including 
labeled in vitro protein-protein binding assays, interaction trap assays, immunoassays for 
protein binding, and the like. 

Given the role of SLC5A8 in transporting sodium solute and in cancer development, the 
agents that bind to SLC5A8 as well as the agents that interfere with SLC5A8 binding to 
SLC5A8-associated proteins may be able to modulate transporting sodium solute or cancer 
development. Accordingly, one aspect of the invention provides a method for assessing the 
ability of an agent to modulate transporting sodium solute or cancer development, comprising: 
1) combining: a first polypeptide including at least a portion of a SLC5A8 polypeptide, a second 
polypeptide including at least a portion of a SLC5A8-associated protein that interacts with the 
first polypeptide, and an agent, under conditions wherein the first polypeptide interacts with the 
second polypeptide in the absence of said agent, 2) determining if said agent interferes with the 
interaction, and 3) for an agent that interferes with the interaction, further assessing its ability to 
interfere with SLC5A8's ability to transport sodium solute or suppress tumor development. 

In one embodiment an activity (e.g., the sodium solute transporting activity) of a 
SLC5A8 protein can be assayed as follows. Xenopus laevis oocytes are injected with mRNA 
encoding the SLC5A8 protein or a eukaryotic expression vector able to express such an mRNA, 
using a Drummond Nanoject (Drummond Scientific, Broomall, Pa.), into the animal pole of 
defolliculated oocytes as described by Swick et al. ((1992) Proc. Natl. Acad. Sci. USA 89:1812- 
1816). The injected oocytes are then transferred to microtiter wells about 12 to 24 hours prior to 
being assayed. The transporter function of oocyte-expressed SLC5A8 polypeptide is assessed 
by sodium uptake as described (see, e.g., Romero et al. (2000) J. Biol. Chem. 275:24552-24559; 
Sciortino et al. (1999) Am. J. Physiol. 277:F611-623). 

A variety of assay formats will suffice and, in light of the present disclosure, those not 
expressly described herein will nevertheless be comprehended by one of ordinary skill in the art. 
Assay formats which approximate such conditions as formation of protein complexes, enzymatic 
activity, may be generated in many different forms, and include assays based on cell-free 
systems, e.g., purified proteins or cell lysates, as well as cell-based assays which utilize intact 
cells. Simple binding assays can also be used to detect agents which bind to SLC5A8. Such 
binding assays may also identify agents that act by disrupting the interaction between a SLC5A8 
polypeptide and a SLC5A8 interacting protein. Agents to be tested can be produced, for 
example, by bacteria, yeast or other organisms (e.g., natural products), produced chemically 
(e.g., small molecules, including peptidomimetics), or produced recombinantly. In a preferred 
embodiment, the test agent is a small organic molecule, e.g., other than a peptide or 
oligonucleotide, having a molecular weight of less than about 2,000 daltons. 

In many drug screening programs which test libraries of compounds and natural extracts, 
high throughput assays are desirable in order to maximize the number of compounds surveyed in 
a given period of time. Assays of the present invention which are performed in cell-free 
systems, such as may be developed with purified or semi-purified proteins or with lysates, are 
often preferred as "primary" screens in that they can be generated to permit rapid development 
and relatively easy detection of an alteration in a molecular target which is mediated by a test 
compound. Moreover, the effects of cellular toxicity and/or bioavailability of the test compound 
can be generally ignored in the in vitro system, the assay instead being focused primarily on the 
effect of the drug on the molecular target as may be manifest in an alteration of binding affinity 
with other proteins or changes in enzymatic properties of the molecular target. 

In preferred in vitro embodiments of the present assay, a reconstituted SLC5A8 complex 
comprises a reconstituted mixture of at least semi-purified proteins. By semi-purified, it is 
meant that the proteins utilized in the reconstituted mixture have been previously separated from 
other cellular or viral proteins. For instance, in contrast to cell lysates, the proteins involved in 
SLC5A8 complex formation are present in the mixture to at least 50% purity relative to all other 
proteins in the mixture, and more preferably are present at 90-95% purity. In certain 
embodiments of the subject method, the reconstituted protein mixture is derived by mixing 
highly purified proteins such that the reconstituted mixture substantially lacks other proteins 
(such as of cellular or viral origin) which might interfere with or otherwise alter the ability to 
measure SLC5A8 complex assembly and/or disassembly. 

Assaying SLC5A8 complexes, in the presence and absence of a candidate agent, can be 
accomplished in any vessel suitable for containing the reactants. Examples include microtitre 
plates, test tubes, and micro-centrifuge tubes. In a screening assay, the effect of a test agent may 
be assessed by, for example, assessing the effect of the test agent on kinetics, steady-state and/or 
endpoint of the reaction. 

In one embodiment of the present invention, drug screening assays can be generated 
which detect inhibitory agents on the basis of their ability to interfere with assembly or stability 
of the SLC5A8 complex. In an exemplary binding assay, the compound of interest is contacted 
with a mixture comprising a SLC5A8 polypeptide and at least one interacting polypeptide. 
Detection and quantification of SLC5A8 complexes provides a means for determining the 
compound's efficacy at inhibiting (or potentiating) interaction between the two polypeptides. 
The efficacy of the compound can be assessed by generating dose response curves from data 
obtained using various concentrations of the test compound. Moreover, a control assay can also 
be performed to provide a baseline for comparison. In the control assay, the formation of 
complexes is quantitated in the absence of the test compound. 
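
The dose-response comparison against the no-compound control can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names, the log-linear interpolation used to estimate the concentration giving 50% inhibition, and any numbers supplied to it are hypothetical and are not taken from this disclosure.

```python
import math


def percent_inhibition(signal, control_signal):
    """Inhibition of complex formation relative to the no-compound control (%)."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (1.0 - signal / control_signal)


def estimate_ic50(concentrations, signals, control_signal):
    """Estimate the concentration that gives 50% inhibition.

    Assumption: interpolate linearly in log10(concentration) between the two
    tested (positive) concentrations that bracket 50% inhibition.
    Returns None when the tested range does not bracket 50%.
    """
    points = sorted(zip(concentrations, signals))
    inhib = [(c, percent_inhibition(s, control_signal)) for c, s in points]
    for (c_lo, i_lo), (c_hi, i_hi) in zip(inhib, inhib[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_c
    return None
```

A fitted sigmoidal model would normally replace the interpolation step once enough concentrations have been tested; the interpolation is used here only to keep the sketch dependency-free.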

Complex formation between the SLC5A8 polypeptides and a substrate polypeptide may 
be detected by a variety of techniques. For instance, modulation in the formation of complexes 
can be quantitated using, for example, detectably labeled proteins (e.g., radiolabeled, 
fluorescently labeled, or enzymatically labeled), by immunoassay, or by chromatographic 
detection. Surface plasmon resonance systems, such as those available from Biacore 
International AB (Uppsala, Sweden), may also be used to detect protein-protein interaction. 

Often, it will be desirable to immobilize one of the polypeptides to facilitate separation 
of complexes from uncomplexed forms of one of the proteins, as well as to accommodate 
automation of the assay. In an illustrative embodiment, a fusion protein can be provided which 
adds a domain that permits the protein to be bound to an insoluble matrix. For example, GST- 
SLC5A8 fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. 
Louis, MO) or glutathione derivatized microtitre plates, which are then combined with a 
potential interacting protein, e.g., a 35S-labeled polypeptide, and the test compound and 
incubated under conditions conducive to complex formation. Following incubation, the beads 
are washed to remove any unbound interacting protein, and the matrix bead-bound radiolabel 
determined directly (e.g., beads placed in scintillant), or in the supernatant after the complexes 
are dissociated, e.g., when a microtitre plate is used. Alternatively, after washing away unbound 
protein, the complexes can be dissociated from the matrix, separated by SDS-PAGE gel, and the 
level of interacting polypeptide found in the matrix-bound fraction quantitated from the gel 
using standard electrophoretic techniques. 

In a further embodiment, agents that bind to a SLC5A8 may be identified by using an 
immobilized SLC5A8. In an illustrative embodiment, a fusion protein can be provided which 
adds a domain that permits the protein to be bound to an insoluble matrix. For example, GST- 
SLC5A8 fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. 
Louis, MO) or glutathione derivatized microtitre plates, which are then combined with a 
potential labeled binding agent and incubated under conditions conducive to binding. Following 
incubation, the beads are washed to remove any unbound agent, and the matrix bead-bound label 
determined directly, or in the supernatant after the bound agent is dissociated. 

In yet another embodiment, the SLC5A8 polypeptide and potential interacting 
polypeptide can be used to generate an interaction trap assay (see also, U.S. Patent No. 
5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J Biol Chem 268:12046- 
12054; Bartel et al. (1993) Biotechniques 14:920-924; and Iwabuchi et al. (1993) Oncogene 
8:1693-1696), for subsequently detecting agents which disrupt binding of the proteins to one 
another. 

One aspect of the present invention provides reconstituted protein preparations including 
a SLC5A8 polypeptide and one or more interacting polypeptides. 

In still further embodiments of the present assay, the SLC5A8 complex is generated in 
whole cells, taking advantage of cell culture techniques to support the subject assay. For 
example, as described below, the SLC5A8 complex can be constituted in a eukaryotic cell 
culture system, including mammalian and yeast cells. Advantages to generating the subject 
assay in an intact cell include the ability to detect inhibitors which are functional in an 
environment more closely approximating that which therapeutic use of the inhibitor would 
require, including the ability of the agent to gain entry into the cell. Furthermore, certain of the 
in vivo embodiments of the assay, such as examples given below, are amenable to high through- 
put analysis of candidate agents. 

The components of the SLC5A8 complex can be endogenous to the cell selected to 
support the assay. Alternatively, some or all of the components can be derived from exogenous 
sources. For instance, fusion proteins can be introduced into the cell by recombinant techniques 
(such as through the use of an expression vector), as well as by microinjecting the fusion protein 
itself or mRNA encoding the fusion protein. 

In many embodiments, a cell is manipulated after incubation with a candidate agent and 
assayed for a SLC5A8 activity. In certain embodiments a SLC5A8 activity is represented by 
sodium transporting activity or tumor suppressing activity. In certain embodiments, SLC5A8 
activities may also include, without limitation, complex formation between SLC5A8 and its 
associated proteins. SLC5A8 complex formation may be assessed by immunoprecipitation and 
analysis of co-immunoprecipitated proteins or affinity purification and analysis of co-purified 
proteins. Fluorescence Resonance Energy Transfer (FRET)-based assays may also be used to 
determine complex formation. Fluorescent molecules having the proper emission and excitation 
spectra that are brought into close proximity with one another can exhibit FRET. The 
fluorescent molecules are chosen such that the emission spectrum of one of the molecules (the 
donor molecule) overlaps with the excitation spectrum of the other molecule (the acceptor 
molecule). The donor molecule is excited by light of appropriate intensity within the donor's 
excitation spectrum. The donor then emits the absorbed energy as fluorescent light. The 
fluorescent energy it produces is quenched by the acceptor molecule. FRET can be manifested 
as a reduction in the intensity of the fluorescent signal from the donor, reduction in the lifetime 
of its excited state, and/or re-emission of fluorescent light at the longer wavelengths (lower 
energies) characteristic of the acceptor. When the fluorescent proteins physically separate, 
FRET effects are diminished or eliminated. (U.S. Patent No. 5,981,200). 
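
The reduction in donor intensity described above can be turned into a number. The following Python sketch uses two standard textbook relations that are not stated in this disclosure and are included here as assumptions: the apparent FRET efficiency E = 1 - F_DA / F_D (donor intensity with and without acceptor), and the Förster distance dependence E = 1 / (1 + (r / R0)^6). Function and parameter names are hypothetical.

```python
def fret_efficiency_from_intensity(donor_with_acceptor, donor_alone):
    """Apparent FRET efficiency from quenching of the donor signal.

    donor_with_acceptor: donor fluorescence measured in the presence of acceptor
    donor_alone:         donor fluorescence measured without acceptor
    """
    if donor_alone <= 0:
        raise ValueError("donor-alone intensity must be positive")
    return 1.0 - donor_with_acceptor / donor_alone


def fret_efficiency_from_distance(r, r0):
    """Foerster relation: efficiency for donor-acceptor distance r.

    r0 is the distance at which efficiency is 50%; r and r0 share one unit.
    """
    if r0 <= 0 or r < 0:
        raise ValueError("r0 must be positive and r non-negative")
    return 1.0 / (1.0 + (r / r0) ** 6)
```

The steep sixth-power dependence is why separation of the two labeled proteins diminishes or eliminates the FRET signal.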

In general, where the screening assay is a binding assay (whether protein-protein 
binding, agent-protein binding, etc.), one or more of the molecules may be joined to a label, 
where the label can directly or indirectly provide a detectable signal. Various labels include 
radioisotopes, fluorescers, chemiluminescers, enzymes, specific binding molecules, particles, 
e.g., magnetic particles, and the like. Specific binding molecules include pairs, such as biotin 
and streptavidin, digoxin and antidigoxin, etc. For the specific binding members, the 
complementary member would normally be labeled with a molecule that provides for detection, 
in accordance with known procedures. 

A variety of other reagents may be included in the screening assay. These include 
reagents like salts, neutral proteins (e.g., albumin), detergents, etc. that are used to facilitate 
optimal protein-protein binding and/or reduce nonspecific or background interactions. Reagents 
that improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors, anti- 
microbial agents, etc. may be used. The components of the mixture are added in any order that 
provides for the requisite binding. Incubations are performed at any suitable temperature, 
typically between 4 °C and 40 °C. Incubation periods are selected for optimum activity, but may 
also be optimized to facilitate rapid high-throughput screening. 

It is to be understood that the screening assays discussed above are applicable to identify 
therapeutic agents related to soluble SLC5A8 polypeptides and derivatives thereof. An 
exemplary derivative of soluble SLC5A8 polypeptides is a fusion protein containing soluble 
SLC5A8 polypeptide. Given the role of soluble SLC5A8 polypeptides in sodium transporting 
and/or tumor suppression, compositions that perturb the formation or stability of the protein- 
protein interactions between soluble SLC5A8 polypeptides and the proteins that they interact 
with, are candidate pharmaceuticals for the treatment of SLC5A8-associated diseases such as 
cancer. 

IX. Predictive Medicine 

The present invention also pertains to the field of predictive medicine in which 
diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic 
(predictive) purposes to thereby treat an individual. Generally, the invention provides a method 
of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a 
gene which encodes SLC5A8, for example cancers (e.g., colon cancer, breast cancer, thyroid 
cancer, or stomach cancer). 

The method includes one or more of the following: 1) detecting, in a tissue of the 
subject, the presence or absence of a mutation which affects the expression of the SLC5A8 gene, 
or detecting the presence or absence of a mutation in a region which controls the expression of 
the gene, e.g., a mutation in the 5' control region; 2) detecting, in a tissue of the subject, the 
presence or absence of a mutation which alters the structure of the SLC5A8 gene; 3) detecting, 
in a tissue of the subject, the misexpression of the SLC5A8 gene, at the mRNA level, e.g., 
detecting a non-wild type level of a mRNA; 4) detecting, in a tissue of the subject, the 
misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a SLC5A8 
polypeptide; and 5) detecting, in a tissue of the subject, methylation of the SLC5A8 gene in the 
5' SLC5A8 genomic nucleotide sequences (see detailed descriptions in the following section). 

In preferred embodiments, the method may also include ascertaining the existence of at 
least one of: 1) a deletion of one or more nucleotides from the SLC5A8 gene; 2) an insertion of 
one or more nucleotides into the gene; 3) a point mutation, e.g., a substitution of one or more 
nucleotides of the gene; and 4) a gross chromosomal rearrangement of the gene, e.g., a 
translocation, inversion, or deletion. 

For example, detecting the genetic lesion can include: (i) providing a probe/primer 
including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a 
sense or antisense sequence from SEQ ID NO: 3 or 4, or naturally occurring mutants thereof, or 
5' or 3' flanking sequences naturally associated with the SLC5A8 gene; (ii) exposing the 
probe/primer to nucleic acid of the tissue; and detecting, by hybridization, e.g., in situ 
hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic 
lesion. 

In preferred embodiments, detecting the misexpression includes ascertaining the 
existence of at least one of: an alteration in the level of a messenger RNA transcript of the 
SLC5A8 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript 
of the gene; or a non-wild type level of SLC5A8. 

Methods of the invention can be used prenatally or to determine if a subject's offspring 
will be at risk for a disorder. In preferred embodiments, the method includes determining the 
structure of a SLC5A8 gene, an abnormal structure being indicative of risk for the disorder. 

In preferred embodiments, the method includes contacting a sample from the subject 
with an antibody to the SLC5A8 protein or a nucleic acid which hybridizes specifically with the 
gene. These and other embodiments are discussed below. 

X. Diagnostic and Prognostic Assays 

Diagnostic and prognostic assays of the invention include methods for assessing the 
expression level of SLC5A8 molecules and for identifying variations and mutations in the 
sequence of SLC5A8 molecules. In certain embodiments, the invention provides methods of 
assaying the SLC5A8 expression level so as to determine whether a patient has or does not have 
a disease condition. Further, such a disease condition may be characterized by decreased 
expression of SLC5A8 nucleic acid or protein described herein. In certain embodiments, the 
invention provides methods for determining whether a patient is or is not likely to have a 
SLC5A8-associated disease by detecting the expression of the SLC5A8 nucleotide sequences. 
In further embodiments, the invention provides methods for determining whether the patient is 
having a relapse or determining whether a patient's cancer is responding to treatment. 

The presence, level, or absence of SLC5A8 protein or nucleic acid in a biological sample 
can be evaluated by obtaining a biological sample from a test subject and contacting the 
biological sample with a compound or an agent capable of detecting SLC5A8 protein or nucleic 
acid (e.g., mRNA, genomic DNA) that encodes SLC5A8 protein such that the presence of 
SLC5A8 protein or nucleic acid is detected in the biological sample. The level of expression of 
the SLC5A8 gene can be measured in a number of ways, including, but not limited to: 
measuring the mRNA encoded by the SLC5A8 gene; measuring the amount of protein encoded 
by the SLC5A8 gene; or measuring the activity of the protein encoded by the SLC5A8 gene. 
The level of mRNA corresponding to the SLC5A8 gene in a cell can be determined both by in 
situ and by in vitro formats. 

The isolated mRNA can be used in hybridization or amplification assays that include, but 
are not limited to, Southern or Northern analyses, polymerase chain reaction (PCR) analyses and 
probe arrays. One preferred diagnostic method for the detection of mRNA levels involves 
contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the 
mRNA encoded by the SLC5A8 gene. The nucleic acid probe can be, for example, a full-length 
SLC5A8 nucleic acid, such as the nucleic acid of SEQ ID NO: 3 or 4, or a portion thereof, such 
as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and 
sufficient to specifically hybridize under stringent conditions to SLC5A8 mRNA or genomic 
DNA. The probe can be disposed on an address of an array, e.g., an array described below. 
Other suitable probes for use in the diagnostic assays are described herein. 

In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the 
probes, for example, by running the isolated mRNA on an agarose gel and transferring the 
mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes 
are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for 
example, in a two-dimensional gene chip array described below. A skilled artisan can adapt 
known mRNA detection methods for use in detecting the level of mRNA encoded by the 
SLC5A8 gene. 

The level of SLC5A8 mRNA in a sample can be evaluated with nucleic acid 
amplification, e.g., by RT-PCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction 
(Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self-sustained sequence replication 
(Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification 
system (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase 
(Lizardi et al. (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. 
Patent No. 5,854,033) or any other nucleic acid amplification method, followed by the detection 
of the amplified molecules using techniques known in the art. As used herein, amplification 
primers are defined as being a pair of nucleic acid molecules that can anneal to 5' or 3' regions 
of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in 
between. In general, amplification primers are from about 10 to 30 nucleotides in length and 
flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and 
with appropriate reagents, such primers permit the amplification of a nucleic acid molecule 
comprising the nucleotide sequence flanked by the primers. 
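
The size ranges given above for amplification primers can be checked mechanically. The following Python sketch is illustrative only: the function names, the requirement of exact sequence matches, and the default ranges (taken from the "about 10 to 30" and "about 50 to 200" nucleotide figures in the text) are assumptions, and the sketch does not model melting temperature or mismatch tolerance.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def check_primer_pair(template, forward, reverse,
                      primer_len=(10, 30), flanked_len=(50, 200)):
    """Return the length of the region flanked by the primers, or None.

    template: plus-strand sequence, 5'->3'
    forward:  primer identical to a stretch of the plus strand
    reverse:  primer written 5'->3' that anneals to the plus strand,
              so its reverse complement must occur in the template
    None is returned if a primer is outside primer_len, a primer site is
    not found, or the flanked region is outside flanked_len.
    """
    template, forward, reverse = template.upper(), forward.upper(), reverse.upper()
    for primer in (forward, reverse):
        if not primer_len[0] <= len(primer) <= primer_len[1]:
            return None
    start = template.find(forward)
    if start < 0:
        return None
    inner_start = start + len(forward)
    reverse_site = template.find(reverse_complement(reverse), inner_start)
    if reverse_site < 0:
        return None
    flanked = reverse_site - inner_start
    if not flanked_len[0] <= flanked <= flanked_len[1]:
        return None
    return flanked
```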

For in situ methods, a cell or tissue sample can be prepared/processed and immobilized 
on a support, typically a glass slide, and then contacted with a probe that can hybridize to 
mRNA that encodes the SLC5A8 gene being analyzed. 

In another embodiment, the methods further include contacting a control sample with a 
compound or agent capable of detecting SLC5A8 mRNA, or genomic DNA, and comparing the 
presence of SLC5A8 mRNA or genomic DNA in the control sample with the presence of 
SLC5A8 mRNA or genomic DNA in the test sample. 

A variety of methods can be used to determine the level of protein encoded by SLC5A8. 
In general, these methods include contacting an agent that selectively binds to the protein, such 
as an antibody, with a sample, to evaluate the level of protein in the sample. In a preferred 
embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more 
preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab')2) can be 
used. The term "labeled," with regard to the probe or antibody, is intended to encompass direct 
labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to 
the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a 
detectable substance. Examples of detectable substances are provided herein. 

The detection methods can be used to detect SLC5A8 protein in a biological sample in 
vitro as well as in vivo. In vitro techniques for detection of SLC5A8 protein include enzyme 
linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme 
immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques 
for detection of SLC5A8 protein include introducing into a subject a labeled anti-SLC5A8 
antibody. For example, the antibody can be labeled with a radioactive marker whose presence 
and location in a subject can be detected by standard imaging techniques. In another 
embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an 
anti-SLC5A8 antibody positioned on an antibody array (as described below). The sample can be 
detected, e.g., with avidin coupled to a fluorescent label. 

In another embodiment, the methods further include contacting the control sample with a 
compound or agent capable of detecting SLC5A8 protein, and comparing the presence of 
SLC5A8 protein in the control sample with the presence of SLC5A8 protein in the test sample. 

The invention also includes kits for detecting the presence of SLC5A8 in a biological 
sample. For example, the kit can include a compound or agent capable of detecting SLC5A8 
protein or mRNA in a biological sample; and a standard. The compound or agent can be 
packaged in a suitable container. The kit can further comprise instructions for using the kit to 
detect SLC5A8 protein or nucleic acid. 

For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid 
support) which binds to a polypeptide corresponding to a marker of the invention; and, 
optionally, (2) a second, different antibody which binds to either the polypeptide or the first 
antibody and is conjugated to a detectable agent. 

For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a 
detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a 
polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for 
amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also 
include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also 
include components necessary for detecting the detectable agent (e.g., an enzyme or a 
substrate). The kit can also contain a control sample or a series of control samples which can be 
assayed and compared to the test sample contained. Each component of the kit can be enclosed 
within an individual container and all of the various containers can be within a single package, 
along with instructions for interpreting the results of the assays performed using the kit. 

The diagnostic methods described herein can identify subjects having, or at risk of 
developing, a disease or disorder associated with misexpressed or aberrant or unwanted SLC5A8 
expression or activity. As used herein, the term "unwanted" includes an unwanted phenomenon 
involved in a biological response such as pain or deregulated cell proliferation. 

In one embodiment, a disease or disorder associated with aberrant or unwanted SLC5A8 
expression or activity is identified. A test sample is obtained from a subject and SLC5A8 
protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the 
presence or absence, of SLC5A8 protein or nucleic acid is diagnostic for a subject having or at 
risk of developing a disease or disorder associated with aberrant or unwanted SLC5A8 
expression or activity. 

The prognostic assays described herein can be used to determine whether a subject can 
be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic 
acid, small molecule, or other drug candidate) to treat a disease or disorder associated with 
aberrant or unwanted SLC5A8 expression or activity. For example, such methods can be used 
to determine whether a subject can be effectively treated with an agent for a pain or solute 
transport disorder. 

In yet another aspect, the invention features a method of evaluating a test compound (see 
also, "Screening Assays", above). The method includes providing a cell and a test compound; 
contacting the test compound to the cell; obtaining a subject expression profile for the contacted 
cell; and comparing the subject expression profile to one or more reference profiles. The 
profiles include a value representing the level of SLC5A8 expression. In a preferred 
embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a 
normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the 
subject expression profile is more similar to the target profile than an expression profile obtained 
from an uncontacted cell. 
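
The profile comparison described above can be sketched in Python as follows. The text does not specify how similarity between profiles is measured; Euclidean distance over shared entries is used here purely as an assumption, and the function names and the dictionary representation of a profile (gene name mapped to expression level) are hypothetical.

```python
import math


def profile_distance(profile_a, profile_b):
    """Euclidean distance between two expression profiles.

    Each profile maps a gene name to an expression level; only genes
    present in both profiles are compared (an assumption of this sketch).
    """
    shared = profile_a.keys() & profile_b.keys()
    if not shared:
        raise ValueError("profiles share no genes")
    return math.sqrt(sum((profile_a[g] - profile_b[g]) ** 2 for g in shared))


def compound_evaluated_favorably(contacted, uncontacted, target):
    """True if the contacted-cell profile is closer to the target profile
    than the profile of an uncontacted cell is."""
    return profile_distance(contacted, target) < profile_distance(uncontacted, target)
```

Any other similarity measure, such as a correlation coefficient, could be substituted without changing the decision rule.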

XI. Methods of Assaying Methylation of SLC5A8 Nucleotides 

In certain aspects, the invention provides assays and methods using the SLC5A8 
nucleotide sequences as molecular markers that distinguish between healthy cells and SLC5A8- 
associated diseased cells (cells of colon cancer, breast cancer, thyroid cancer or stomach cancer). 
In one aspect, a molecular marker of the invention is a differentially methylated SLC5A8 
nucleotide sequence. 

Accordingly, in certain embodiments, the invention provides assays for detecting 
differentially methylated SLC5A8 nucleotide sequences, such as the differential methylation 
patterns in nucleic acid sequence of SEQ ID NO: 12, 13 or 14. Thus, a differentially methylated 
SLC5A8 nucleotide sequence, in its methylated state, can be a SLC5A8-associated cancer- 
specific modification that serves as a target for detection using various methods described herein 
and the methods that are well within the purview of the skilled artisan in view of the teachings 
of this application. 

In certain aspects, such methods for detecting methylated SLC5A8 nucleotide sequences 
are based on treatment of SLC5A8 genomic DNA with a chemical compound which converts 
non-methylated C, but not methylated C (i.e., 5mC), to a different nucleotide base. One such 
compound is sodium bisulfite, which converts C, but not 5mC, to U. Methods for bisulfite 
treatment of DNA are known in the art (Herman, et al., 1996, Proc Natl Acad Sci USA, 
93:9821-6; Herman and Baylin, 1998, Current Protocols in Human Genetics, N. E. A. 
Dracopoli, ed., John Wiley & Sons, 2:10.6.1-10.6.10; U.S. Patent No. 5,786,146). To illustrate, 
when a DNA molecule that contains unmethylated C nucleotides is treated with sodium 
bisulfite to become a compound-converted DNA, the sequence of that DNA is changed (C->U). 
Detection of the U in the converted nucleotide sequence is indicative of an unmethylated C. 
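
To illustrate the conversion described above, the following is a minimal sketch in Python. The 
function name, the example sequence and the chosen methylated positions are hypothetical and 
are given for illustration only; the sketch models the outcome of bisulfite treatment on a single 
strand and not the chemistry itself. 

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Return the compound-converted strand, read 5'->3'.

    Every C is written as U unless its 0-based index is listed in
    methylated_positions (i.e., unless it is 5mC).
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("U")   # unmethylated C is converted to U
        else:
            converted.append(base)  # 5mC and all other bases are unchanged
    return "".join(converted)


template = "ACGTCCGGA"  # made-up sequence; CpG cytosines sit at indices 1 and 5

unmethylated = bisulfite_convert(template)
methylated = bisulfite_convert(template, {1, 5})
print(unmethylated)  # AUGTUUGGA
print(methylated)    # ACGTUCGGA
```

A U at a position that held C in the untreated sequence marks that C as unmethylated, whereas a 
C that survives treatment marks a 5mC. 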

The different nucleotide base (e.g., U) present in compound-converted nucleotide 
sequences can subsequently be detected in a variety of ways. In a preferred embodiment, the 
present invention provides a method of detecting U in compound-converted SLC5A8 DNA 
sequences by using "methylation sensitive PCR" (MSP) (see, e.g., Herman, et al., 1996, Proc. 
Natl. Acad. Sci. USA, 93:9821-9826; U.S. Patent Nos. 6,265,171; 6,017,704; and 6,200,756). In 
MSP, one set of primers (i.e., comprising a forward and a reverse primer) amplifies the 
compound-converted template sequence if C bases in CpG dinucleotides within the SLC5A8 
DNA are methylated. This set of primers is called "methylation-specific primers." Another set 
of primers amplifies the compound-converted template sequence if C bases in CpG 
dinucleotides within the SLC5A8 5' flanking sequence are not methylated. This set of primers 
is called "unmethylation-specific primers." 

In MS-PCR, the reactions use the compound-converted DNA from a sample from a subject. 
In assays for SLC5A8 methylated DNA, methylation-specific primers are used. In the case 
where C within CpG dinucleotides of the target sequence of the DNA are methylated, the 
methylation-specific primers will amplify the compound-converted template sequence in the 
presence of a polymerase and an MSP product will be produced. If C within CpG dinucleotides 
of the target sequence of the DNA are not methylated, the methylation-specific primers will not 
amplify the compound-converted template sequence in the presence of a polymerase and an 
MSP product will not be produced. 

It is often also useful to run a control reaction for the detection of unmethylated SLC5A8 
DNA. The reaction uses the compound-converted DNA from a sample from a subject, and 
unmethylation-specific primers are used. In the case where C within CpG dinucleotides of the 
target sequence of the DNA are unmethylated, the unmethylation-specific primers will amplify 
the compound-converted template sequence in the presence of a polymerase and an MSP 
product will be produced. If C within CpG dinucleotides of the target sequence of the DNA are 
methylated, the unmethylation-specific primers will not amplify the compound-converted 
template sequence in the presence of a polymerase and an MSP product will not be produced. 
Note that a biologic sample will often contain a mixture of both neoplastic cells that give rise to 
a signal with methylation-specific primers, and normal cellular elements that give rise to a signal 
with unmethylation-specific primers. The unmethylation-specific signal is often of use as a 
control reaction, but does not in this instance imply the absence of cancer (e.g., colon cancer, 
breast cancer, thyroid cancer, or stomach cancer) as indicated by the positive signal derived from 
reactions using the methylation-specific primers. 
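
The behavior of the two primer sets can be illustrated with a short Python sketch. It is a 
simplification under stated assumptions: only a forward primer-binding site is modeled, U is 
written as T (as it reads after amplification), a product is assumed to form only on an exact 
match, and the site sequence and all function names are hypothetical. 

```python
def bisulfite_as_dna(seq, methylated):
    """Bisulfite-convert seq and write each U as T, as it reads after PCR."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def cpg_cytosines(seq):
    """0-based indices of every C that is immediately followed by G."""
    return {i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"}


def primer_pair_for_site(site):
    """Return (methylation-specific, unmethylation-specific) forward primers."""
    m_primer = bisulfite_as_dna(site, cpg_cytosines(site))  # CpG C stays C
    u_primer = bisulfite_as_dna(site, set())                # every C reads as T
    return m_primer, u_primer


def msp_product_formed(primer, converted_site):
    """Assumed rule for this sketch: product only on an exact primer match."""
    return primer == converted_site


site = "GACGTTCGCA"  # hypothetical primer-binding site with two CpG dinucleotides
m_primer, u_primer = primer_pair_for_site(site)

methylated_sample = bisulfite_as_dna(site, cpg_cytosines(site))
unmethylated_sample = bisulfite_as_dna(site, set())

print(msp_product_formed(m_primer, methylated_sample))    # True
print(msp_product_formed(m_primer, unmethylated_sample))  # False
print(msp_product_formed(u_primer, unmethylated_sample))  # True
```

A sample that contains both methylated and unmethylated templates would give a product with 
each primer set, which corresponds to the mixed-sample case noted above. 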

Primers for an MSP reaction are derived from the compound-converted SLC5A8 
template sequence. Herein, "derived from" means that the sequences of the primers are chosen 
such that the primers amplify the compound-converted template sequence in an MSP reaction. 
Each primer comprises a single-stranded DNA fragment which is at least 8 nucleotides in length. 
Preferably, the primers are less than 50 nucleotides in length, more preferably from 15 to 35 
nucleotides in length. Because the compound-converted SLC5A8 template sequence can be 
either the Watson strand or the Crick strand of the double-stranded DNA that is treated with 
sodium bisulfite, the sequences of the primers are dependent upon whether the Watson or Crick 
compound-converted template sequence is chosen to be amplified in the MSP. Either the 
Watson or Crick strand can be chosen to be amplified. 

The compound-converted SLC5A8 template sequence, and therefore the product of the 
MSP reaction, can be between 20 and 3000 nucleotides in length, preferably between 50 and 500 
nucleotides in length, more preferably between 80 and 150 nucleotides in length. Preferably, the 
methylation-specific primers result in an MSP product of a different length than the MSP 
product produced by the unmethylation-specific primers. 

A variety of methods can be used to determine if an MSP product has been produced in a 
reaction assay. One way to determine if an MSP product has been produced in the reaction is to 
analyze a portion of the reaction by agarose gel electrophoresis. For example, a horizontal 
agarose gel of from 0.6 to 2.0% agarose is made and a portion of the MSP reaction mixture is 
electrophoresed through the agarose gel. After electrophoresis, the agarose gel is stained with 
ethidium bromide. MSP products are visible when the gel is viewed during illumination with 
ultraviolet light. By comparison to standardized size markers, it is determined if the MSP 
product is of the correct expected size. 

Other methods can be used to determine whether a product is made in an MSP reaction. 
One such method is called "real-time PCR." Real-time PCR utilizes a thermal cycler (i.e., an 
instrument that provides the temperature changes necessary for the PCR reaction to occur) that 
incorporates a fluorimeter (i.e., an instrument that measures fluorescence). The real-time PCR 
reaction mixture also contains a reagent whose incorporation into a product can be quantified 
and whose quantification is indicative of copy number of that sequence in the template. One 
such reagent is a fluorescent dye, called SYBR Green I (Molecular Probes, Inc.; Eugene, 
Oregon) that preferentially binds double-stranded DNA and whose fluorescence is greatly 
enhanced by binding of double-stranded DNA. When a PCR reaction is performed in the 
presence of SYBR Green I, resulting DNA products bind SYBR Green I and fluoresce. The 
fluorescence is detected and quantified by the fluorimeter. Such a technique is particularly useful 
for quantification of the amount of the product in the PCR reaction. Additionally, the product 
from the PCR reaction may be quantitated in "real-time PCR" by the use of a variety of probes 
that hybridize to the product, including TaqMan probes and molecular beacons. Quantitation 
may be on an absolute basis, or may be relative to a constitutively methylated DNA standard, or 
may be relative to an unmethylated DNA standard. In one instance the ratio of methylated 
SLC5A8 derived product to unmethylated derived SLC5A8 product may be constructed. 
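
One way such a ratio might be constructed from real-time PCR data is sketched below in Python. 
The use of threshold cycle values, the assumption that both reactions amplify by the same fold 
per cycle, the function names and the example numbers are all assumptions made for 
illustration; they are not taken from this application. 

```python
def methylated_to_unmethylated_ratio(ct_methylated, ct_unmethylated,
                                     fold_per_cycle=2.0):
    """Ratio of methylated-derived to unmethylated-derived product.

    Assumption (not from the text): both reactions amplify by the same
    fold_per_cycle, so a reaction that reaches the detection threshold one
    cycle earlier started with fold_per_cycle times more template.
    """
    return fold_per_cycle ** (ct_unmethylated - ct_methylated)


def methylated_fraction(ratio):
    """Convert a methylated:unmethylated ratio to a fraction of the total."""
    return ratio / (1.0 + ratio)


# Hypothetical threshold cycles: the methylation-specific reaction crosses
# the threshold two cycles later than the unmethylation-specific reaction.
ratio = methylated_to_unmethylated_ratio(ct_methylated=27.0, ct_unmethylated=25.0)
print(ratio)                       # 0.25
print(methylated_fraction(ratio))  # approximately 0.2
```

In practice the amplification efficiency of each primer set would need to be measured, for 
example against the methylated and unmethylated DNA standards mentioned above. 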

Methods for detecting methylation of the SLC5A8 DNA in this invention are not limited 
to MSP, and may cover any assay for detecting DNA methylation. Another example method for 
detecting methylation of the SLC5A8 DNA is by using "methylation-sensitive" restriction 
endonucleases. Such methods comprise treating the genomic DNA isolated from a subject with 
a methylation-sensitive restriction endonuclease and then using the restriction 
endonuclease-treated DNA as a template in a PCR reaction. Herein, methylation-sensitive 
restriction endonucleases recognize and cleave a specific sequence within the DNA if C bases 
within the recognition sequence are not methylated. If C bases within the recognition sequence 
of the restriction endonuclease are methylated, the DNA will not be cleaved. Examples of such 
methylation-sensitive restriction endonucleases include, but are not limited to HpaII, SmaI, 
SacII, EagI, MspI, BstUI, and BssHII. In this technique, a recognition sequence for a 
methylation-sensitive restriction endonuclease is located within the template DNA, at a position 
between the forward and reverse primers used for the PCR reaction. In the case that a C base 
within the methylation-sensitive restriction endonuclease recognition sequence is not 
methylated, the endonuclease will cleave the DNA template and a PCR product will not be 
formed when the DNA is used as a template in the PCR reaction. In the case that a C base 
within the methylation-sensitive restriction endonuclease recognition sequence is methylated, 
the endonuclease will not cleave the DNA template and a PCR product will be formed when the 
DNA is used as a template in the PCR reaction. Therefore, methylation of C bases can be 
determined by the absence or presence of a PCR product (Kane, et al., 1997, Cancer Res, 
57:808-11). No sodium bisulfite is used in this technique. 
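
The logic of the restriction endonuclease-based assay can be sketched in Python as follows. The 
template, the primer boundaries and the function names are hypothetical; the sketch assumes 
complete digestion and uses HpaII, which recognizes CCGG and is blocked when the internal C 
is methylated. 

```python
HPAII_SITE = "CCGG"  # HpaII cleaves CCGG only when the internal C is unmethylated


def hpaii_sites(seq):
    """0-based start index of every CCGG in seq."""
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == HPAII_SITE]


def pcr_product_after_digest(template, fwd_end, rev_start, methylated_c):
    """True if the template between the primers survives HpaII digestion.

    fwd_end      index just past the forward primer (hypothetical layout)
    rev_start    index where the reverse primer region begins
    methylated_c set of 0-based indices of methylated cytosines
    """
    for i in hpaii_sites(template):
        between_primers = fwd_end <= i and i + 4 <= rev_start
        internal_c_methylated = (i + 1) in methylated_c
        if between_primers and not internal_c_methylated:
            return False  # template is cut, so no PCR product forms
    return True           # every site is protected, so a product forms


template = "AAAACCGGTTTTCCGGAAAA"  # made-up; HpaII sites start at 4 and 12

print(pcr_product_after_digest(template, 4, 16, {5, 13}))  # True
print(pcr_product_after_digest(template, 4, 16, {5}))      # False
print(pcr_product_after_digest(template, 4, 16, set()))    # False
```

Because a single unmethylated site is enough to cut the template, an amplicon that spans more 
sites gives a product only when all of them are methylated. 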

Yet another exemplary method for detecting methylation of the SLC5A8 DNA is called 
the modified MSP, which method utilizes primers that are designed and chosen such that 
products of the MSP reaction are susceptible to digestion by restriction endonucleases, 
depending upon whether the compound-converted template sequence contains CpG 
dinucleotides or UpG dinucleotides. 

Yet other methods for detecting methylation of the SLC5A8 DNA include the MS-SnuPE 
methods. This method uses compound-converted SLC5A8 DNA as a template in a 
primer extension reaction wherein the primers used produce a product, dependent upon whether 
the compound-converted template contains CpG dinucleotides or UpG dinucleotides (see e.g., 
Gonzalgo, et al., 1997, Nucleic Acids Res., 25:2529-31). 

Another exemplary method for detecting methylation of the SLC5A8 DNA is called 
COBRA (i.e., combined bisulfite restriction analysis). This method has been routinely used for 
DNA methylation detection and is well known in the art (see, e.g., Xiong, et al., 1997, Nucleic 
Acids Res, 25:2532-4). 

In certain embodiments, the invention provides methods that involve directly sequencing 
the product resulting from an MSP reaction to determine if the compound-converted SLC5A8 
template sequence contains CpG dinucleotides or UpG dinucleotides. Molecular biology 
techniques such as directly sequencing a PCR product are well known in the art. 

XII. SLC5A8 Oligonucleotides for Methylation Detection 

In yet other aspects, the application provides oligonucleotide primers for amplifying a 
region within the SLC5A8 nucleic acid sequence of any one of SEQ ID NOs: 5-11. In certain 
aspects, a pair of the oligonucleotide primers (for example, SEQ ID NOs: 5-7) can be used in a 
detection assay, such as the HpaII assay. In certain aspects, primers used in an MSP reaction 
can specifically distinguish between methylated and non-methylated SLC5A8 DNA, for 
example, SEQ ID NOs: 8-11. 

The primers of the invention have sufficient length and appropriate sequence so as to 
provide specific initiation of amplification of SLC5A8 nucleic acids. Primers of the invention 
are designed to be "substantially" complementary to each strand of the SLC5A8 nucleic acid 
sequence to be amplified. While exemplary primers are provided in SEQ ID NOs: 5-11, it is 
understood that any primer that hybridizes with the bisulfite-converted SLC5A8 sequence of 
SEQ ID NOs: 12-14 is included within the scope of this invention and is useful in the method 
of the invention for detecting methylated nucleic acid, as described. Similarly, it is understood 
that any primers that would serve to amplify a methylation-sensitive restriction site or sites 
within the differentially methylated region of SEQ ID NOs: 12-14 are included within the scope 
of this invention and are useful in the method of the invention for detecting methylated 
nucleic acid, as described. 

The oligonucleotide primers of the invention may be prepared by using any suitable 
method, such as conventional phosphotriester and phosphodiester methods or automated 
embodiments thereof. In one such automated embodiment, diethylphosphoramidites are used as 
starting materials and may be synthesized as described by Beaucage, et al. (Tetrahedron Letters, 
22:1859-1862, 1981). One method for synthesizing oligonucleotides on a modified solid 
support is described in U.S. Patent No. 4,458,066. 

In particular, a pair of primers is selected to amplify the SLC5A8 methylation target 
region or a DNA segment thereof. The targeted DNA segment that is amplified by the primers 
contains a plurality of sites that are recognized by the methylation-sensitive restriction enzyme 
and is located between base pairs 82200 and 83267 of GenBank entry AC063951. In one 
preferred embodiment, the targeted DNA segment comprises at least four HpaII sites and the 
primers amplify a region including base pair 82638 through base pair 83080 of GenBank entry 
AC063951. In another highly preferred embodiment, the targeted DNA segment comprises at 
least six HpaII sites and the primers amplify a region including base pair 82430 through base 
pair 83080 of GenBank entry AC063951. 

For example, each primer comprises a single-stranded DNA fragment which is at least 8 
nucleotides in length. Preferably, the primers are less than 50 nucleotides in length, more 
preferably from 15 to 35 nucleotides in length. The sequences of the primers are derived from 
the sequence of the targeted DNA segment, i.e., the segment that is to be amplified. The 
sequence of the forward primer is identical to a sequence at the 5' end of the targeted DNA 
segment. The sequence of the reverse primer is the reverse complement of a sequence at the 3' 
end of the targeted DNA segment. 
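
The relationship between the targeted DNA segment and its primers can be sketched in Python 
as shown below. The segment shown is a made-up sequence and the function names are 
hypothetical; the length check reflects the ranges stated above (at least 8 and less than 50 
nucleotides). 

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def derive_primers(segment, length=20):
    """Return (forward, reverse) primers for the targeted DNA segment.

    forward: identical to the sequence at the 5' end of the segment
    reverse: reverse complement of the sequence at the 3' end
    """
    if not 8 <= length < 50:
        raise ValueError("primer length must be at least 8 and less than 50")
    if len(segment) < length:
        raise ValueError("segment is shorter than the requested primer length")
    forward = segment[:length]
    reverse = reverse_complement(segment[-length:])
    return forward, reverse


segment = "ATGCCGGTACGATTTGCCGGAACT"  # made-up 24-nucleotide segment
forward, reverse = derive_primers(segment, length=8)
print(forward)  # ATGCCGGT
print(reverse)  # AGTTCCGG
```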

XIII. Subjects and Samples 

In certain aspects, the invention relates to a subject that is suspected of having, or has, a 
SLC5A8-associated disease, such as colon cancer, breast cancer, thyroid cancer, or stomach 
cancer. Alternatively, a subject may be undergoing routine screening and may not necessarily 
be suspected of having such a SLC5A8-associated disease or condition. In a preferred 
embodiment, the subject is a human subject, and the SLC5A8-associated disease is colon 
neoplasia. 

Assaying for SLC5A8 markers discussed above in a sample from subjects not known to 
have a cancer (e.g., colon cancer, breast cancer, thyroid cancer, or stomach cancer) can aid in 
diagnosis of such a cancer in the subject. To illustrate, detecting the methylation status of the 
SLC5A8 nucleotide sequence by MSP can be used by itself, or in combination with other 
various assays, to improve the sensitivity and/or specificity for detecting a cancer. Preferably, 
such a detection is made at an early stage in the development of cancer, so that treatment is more 
likely to be effective. 

In addition to diagnosis, assaying of a SLC5A8 marker in a sample from a subject not 
known to have a cancer (e.g., colon cancer, breast cancer, thyroid cancer, or stomach cancer) can 
be prognostic for the subject (e.g., indicating the probable course of the disease). To illustrate, 
subjects having a predisposition to develop colon neoplasia may possess methylated SLC5A8 
nucleotide sequences. Assaying of SLC5A8 markers in samples from subjects can also be 
used to select a particular therapy or therapies which are particularly effective against the colon 
neoplasia in the subject, or to exclude therapies that are not likely to be effective. 

Assaying of SLC5A8 markers in samples from subjects that are known to have, or to 
have had, a cancer associated with silencing of the SLC5A8 gene is also useful. For example, 
the present methods can be used to identify whether therapy is effective or not for certain 
subjects. One or more samples are taken from the same subject prior to and following therapy, 
and assayed for the SLC5A8 markers. A finding that the SLC5A8 marker is present in the 
sample taken prior to therapy and absent (or at a lower level) after therapy would indicate that 
the therapy is effective and need not be altered. In those cases where the SLC5A8 marker is 
present in the sample taken before therapy and in the sample taken after therapy, it may be 
desirable to alter the therapy to increase the likelihood that the cancer will be eradicated in the 
subject. Thus, the present method may obviate the need to perform more invasive procedures 
which are used to determine a patient's response to therapy. 

Cancers frequently recur following therapy in patients with advanced cancers. In this 
and other instances, the assays of the invention are useful for monitoring over time the status of 
a cancer associated with silencing of the SLC5A8 gene. For subjects in which a cancer is 
progressing, a SLC5A8 marker may be absent from some or all samples when the first sample is 
taken and then appear in one or more samples when the second sample is taken. For subjects in 
which cancer is regressing, a SLC5A8 marker may be present in one or a number of samples 
when the first sample is taken and then be absent in some or all of these samples when the 
second sample is taken. 

Samples for use with the methods described herein may be essentially any biological 
material of interest. For example, a sample may be a bodily fluid sample from a subject, a tissue 
sample from a subject, a solid or semi-solid sample from a subject, a primary cell culture or 
tissue culture of materials derived from a subject, cells from a cell line, or medium or other 
extracellular material from a cell or tissue culture, or a xenograft (meaning a sample of a cancer 
from a first subject, e.g., a human, that has been cultured in a second subject, e.g., an 
immuno-compromised mouse). The term "sample" as used herein is intended to encompass both a 
biological material obtained directly from a subject (which may be described as the primary 
sample) as well as any manipulated forms or portions of a primary sample. A sample may also 
be obtained by contacting a biological material with an exogenous liquid, resulting in the 
production of a lavage liquid containing some portion of the contacted biological material. 
Furthermore, the term "sample" is intended to encompass the primary sample after it has been 
mixed with one or more additives, such as preservatives, chelators, anti-clotting factors, etc. 

In certain embodiments, a bodily fluid sample is a blood sample. In this case, the term 
"sample" is intended to encompass not only the blood as obtained directly from the patient but 
also fractions of the blood, such as plasma, serum, cell fractions (e.g., platelets, erythrocytes, 
and lymphocytes), protein preparations, nucleic acid preparations, etc. In certain embodiments, 
a bodily fluid sample is a urine sample or a colonic effluent sample. In certain embodiments, a 
bodily fluid sample is a stool sample. 

A subject is preferably a human subject, but it is expected that the molecular markers 
disclosed herein, and particularly their homologs from other animals, are of similar utility in 
other animals. In certain embodiments, it may be possible to detect a SLC5A8 marker directly 
in an organism without obtaining a separate portion of biological material. In such instances, the 
term "sample" is intended to encompass that portion of biological material that is contacted with 
a reagent or device involved in the detection process. 

In certain embodiments, DNA which is used as the template in an MSP reaction is 
obtained from a bodily fluid sample. Examples of preferred bodily fluids are blood, serum, 
plasma, a blood-derived fraction, stool, colonic effluent or urine. Other body fluids can also be 
used. Because they can be easily obtained from a subject and can be used to screen for multiple 
diseases, blood or blood-derived fractions are especially useful. For example, it has been shown 
that DNA alterations in colorectal cancer patients can be detected in the blood of subjects (Hibi, 
et al., 1998, Cancer Res, 58:1405-7). Blood-derived fractions can comprise blood, serum, 
plasma, or other fractions. For example, a cellular fraction can be prepared as a "buffy coat" 
(i.e., leukocyte-enriched blood portion) by centrifuging 5 ml of whole blood for 10 min at 800 
times gravity at room temperature. Red blood cells sediment most rapidly and are present as the 
bottom-most fraction in the centrifuge tube. The buffy coat is present as a thin creamy white 
colored layer on top of the red blood cells. The plasma portion of the blood forms a layer above 
the buffy coat. Fractions from blood can also be isolated in a variety of other ways. One 
method is by taking a fraction or fractions from a gradient used in centrifugation to enrich for a 
specific size or density of cells. 

DNA is then isolated from samples from the bodily fluids. Procedures for isolation of 
DNA from such samples are well known to those skilled in the art. Commonly, such DNA 
isolation procedures comprise lysis of any cells present in the samples using detergents, for 
example. After cell lysis, proteins are commonly removed from the DNA using various 
proteases. RNA is removed using RNase. The DNA is then commonly extracted with phenol, 
precipitated in alcohol and dissolved in an aqueous solution. 

XIV. Therapeutic methods for SLC5A8-associated diseases. 

Yet another aspect of this application pertains to methods of treating a 
SLC5A8-associated disease (e.g., a proliferative disease such as cancer) which arises from reduced 
expression or over-expression of the SLC5A8 gene in cells. In certain cases, such 
SLC5A8-associated diseases (for example, colon cancer, breast cancer, thyroid cancer, or stomach 
cancer) can result from a wide variety of pathological cell proliferative conditions. In certain 
embodiments, treatment of a SLC5A8-associated disorder includes modulation of the SLC5A8 
gene expression or SLC5A8 activity. The term "modulate" envisions the suppression of 
expression of SLC5A8 when it is over-expressed, or augmentation of SLC5A8 expression when 
it is under-expressed. 

In an embodiment, the present invention provides a therapeutic method by using a 
SLC5A8 gene construct as a part of a gene therapy protocol, such as to reconstitute the function 
of a SLC5A8 protein (e.g., SEQ ID NO: 1) in a cell in which the SLC5A8 protein is 
mis-expressed or non-expressed. To illustrate, cell types which exhibit pathological or abnormal 
growth presumably depend at least in part on a function of a SLC5A8 protein. For example, 
gene therapy constructs encoding the SLC5A8 protein can be utilized in a cancer that is 
associated with silencing of the SLC5A8 gene, such as colon cancer, breast cancer, thyroid 
cancer, or stomach cancer. 

In certain embodiments, the invention provides therapeutic methods using agents which 
induce re-expression of SLC5A8. Because loss of SLC5A8 gene expression in SLC5A8-associated 
diseased cells may be due at least in part to methylation of the SLC5A8 nucleotide sequence, 
methylation suppressive agents such as 5-deoxyazacytidine or 5-azacytidine can be introduced 
into the diseased cells. Other similar agents will be known to those of skill in the art. In a 
preferred embodiment, the SLC5A8-associated disease is colon neoplasia associated with 
increased methylation of SLC5A8 nucleotide sequences. 

The present invention also provides gene therapy for the treatment of proliferative or 
immunologic disorders which are associated with SLC5A8. Such therapy would achieve its 
therapeutic effect by introduction of the SLC5A8 polynucleotide encoding full-length SLC5A8 
into diseased cells. 

Delivery of the SLC5A8 polynucleotide or the SLC5A8 gene can be achieved using a 
recombinant expression vector such as a chimeric virus or a colloidal dispersion system. 
Especially preferred for therapeutic delivery of antisense sequences is the use of targeted 
liposomes. Various viral vectors which can be utilized for gene therapy as taught herein include 
adenovirus, herpes virus, vaccinia, or, preferably, an RNA virus such as a retrovirus. Preferably, 
the retroviral vector is a derivative of a murine or avian retrovirus. Examples of retroviral 
vectors in which a single foreign gene can be inserted include, but are not limited to: Moloney 
murine leukemia virus (MoMuLV), Harvey murine sarcoma virus (HaMuSV), murine mammary 
tumor virus (MuMTV), and Rous Sarcoma Virus (RSV). Preferably, when the subject is a 
human, a vector such as the gibbon ape leukemia virus (GaLV) is utilized. A number of 
additional retroviral vectors can incorporate multiple genes. All of these vectors can transfer or 
incorporate a gene for a selectable marker so that transduced cells can be identified and 
generated. By inserting a SLC5A8 sequence of interest into the viral vector, along with another 
gene which encodes the ligand for a receptor on a specific target cell, for example, the vector is 
target-specific. Retroviral vectors can be made target-specific by attaching, for example, a 
sugar, a glycolipid or a protein. Preferred targeting is accomplished by using an antibody to 
target the retroviral vector. Those skilled in the art will know of, or can readily ascertain 
without undue experimentation, specific polynucleotide sequences which can be inserted into 
the retroviral genome or attached to a viral envelope to allow target-specific delivery of the 
retroviral vector containing the SLC5A8 gene. 

The invention also relates to a medicament or pharmaceutical composition comprising a 
SLC5A8 5' flanking polynucleotide or a SLC5A8 5' flanking polynucleotide operably linked to 
a SLC5A8 structural gene, respectively, in a pharmaceutically acceptable excipient or medium 
wherein the medicament is used for therapy of SLC5A8-associated diseases, such as colon 
cancer, breast cancer, thyroid cancer, or stomach cancer. 

Exemplification 

The invention now being generally described, it will be more readily understood by 
reference to the following examples, which are included merely for purposes of illustration of 
certain aspects and embodiments of the present invention, and are not intended to limit the 
invention. 

Abstract: 

We identify a new gene, SLC5A8, and show it is a candidate tumor suppressor gene 
whose silencing by aberrant methylation is a common and early event in human colon neoplasia. 
Aberrant DNA methylation has been implicated as a component of an epigenetic mechanism 
that silences genes in human cancers. Using restriction landmark genome scanning, we 
performed a global search to identify new genes that would be aberrantly methylated at high 
frequency in human colon cancer. From among 1,231 genomic NotI sites assayed, site 3D41 
was identified as methylated in 11 of 12 colon cancers profiled. Site 3D41 mapped to exon 1 of 
SLC5A8, a novel transcript that we assembled. In normal colon mucosa we found SLC5A8 
exon 1 is unmethylated, and SLC5A8 transcript is expressed. In contrast, SLC5A8 exon 1 proved 
aberrantly methylated in 59% of primary colon cancers and 52% of colon cancer cell lines. 
SLC5A8 exon 1 methylated cells were uniformly silenced for SLC5A8 expression, but 
reactivated expression upon treatment with a demethylating drug, 5-azacytidine. Transfection of 
SLC5A8 suppressed colony growth in each of three SLC5A8 deficient cell lines, but showed no 
suppressive effect in any of three SLC5A8 proficient cell lines. SLC5A8 exon 1 methylation is 
an early event, detectable in colon adenomas, and in even earlier microscopic colonic aberrant 
crypt foci. Structural homology and functional testing demonstrated SLC5A8 is a novel 
member of the family of sodium solute symporters, which are now added as a new class of 
candidate colon cancer suppressor genes. 

Introduction: 

Cytosine methylation within CpG dinucleotides is a recognized epigenetic DNA 
modification, which in normal human tissues is excluded from CpG rich "islands" that mark the 
promoters of certain genes (Baylin, et al., 1998, Adv Cancer Res 72:141-96; Jones, et al., 1999, 
Trends Genet 15:34-7; Baylin, et al., 2002, Cancer Cell 1:299-305). Global hypomethylation 
accompanied by aberrant focal CpG island hypermethylation has emerged as one of the 
signature alterations evidenced by the cancer genome (Baylin, et al., 1998, Adv Cancer Res 
72:141-96; Jones, et al., 1999, Trends Genet 15:34-7; Baylin, et al., 2002, Cancer Cell 
1:299-305; Feinberg, et al., 1983, Nature 301:89-92). Moreover, silencing of gene expression as 
marked by aberrant methylation of CpG island promoter regions has emerged as a novel 
mechanism for the inactivation of tumor suppressor genes that provides an alternative to either 
mutation or to allelic loss (Baylin, et al., 1998, Adv Cancer Res 72:141-96; Jones, et al., 1999, 
Trends Genet 15:34-7; Kane, et al., 1997, Cancer Res 57:808-11; Veigl, et al., 1998, Proc Natl 
Acad Sci USA 95:8698-702). Additionally, aberrant methylation of defined genomic 
sequences can serve as a potentially useful diagnostic marker for detection of human cancers 
(Grady, et al., 2001, Cancer Res 61:900-2; Usadel, et al., 2002, Cancer Res 62:371-5). 

Restriction landmark genome scanning (RLGS) provides a global analysis of 
methylation events in a cancer cell by providing a two dimensional display of the methylation 
status of genomic NotI sites (Costello, et al., 2000, Nat Genet 24:132-8). To identify new tumor 
suppressor genes and/or identify new genes targeted for methylation in human colon cancer, we 
carried out RLGS analysis of 12 colon cancer cell lines. This analysis led to the identification 
of a novel transcript SLC5A8, whose aberrant methylation and transcriptional silencing was 
found to be a common and early event in human colon cancers, and that was found to encode a 
novel sodium symporter whose restoration can markedly suppress colony forming ability of 
colon cells in which endogenous SLC5A8 has been inactivated. 

Significance: 

This study demonstrates the application of restriction landmark genome scanning to 
identify a novel high frequency aberrant methylation event in human colon cancer. We extend 
that observation to identify a novel sodium transporter, SLC5A8, silenced by the methylation 
event. SLC5A8 methylation is among the most frequent molecular alterations in colon cancer, 
and finding SLC5A8 is a growth suppressor adds sodium transporters as a new functional class 
that can act as tumor suppressors. Moreover, detecting SLC5A8 methylation in aberrant crypt 
foci demonstrates this event as one of the earliest molecular changes in colon neoplasia, and 
adds further molecular support to the model in which at least some aberrant crypt foci are able to 
progress to more advanced colon adenomas and cancers. 

Example 1: 

Figure 3 depicts certain aspects of the present invention. The numerical coordinates are 
those of genomic clone AC063951. Lollipops designate CpG sites that are potential acceptors of 
aberrant methylation. Asterisks designate sites recognized by the HpaII restriction enzyme, which 
cuts these sites if unmethylated, but not if methylated. Shown are the positions of PCR primers 
that amplify regions crossing 6 HpaII sites, or regions crossing 4 HpaII sites. Also shown is the 
position of PCR primers designed for methyl-specific PCR (MS-PCR) assays that amplify 
sodium bisulfite converted DNA specifically derived from templates that are either methylated 
or unmethylated at CpG dinucleotides interrogated by the PCR primers. Also shown in the gray 
bar is the 5' end of exon 1 of the SLC5A8 transcript which overlaps with the methylation sites 
detected in both MS-PCR and HpaII based assays. Lastly indicated is a site corresponding to 
methylation site 2D41 detected in Restriction Landmark Genome Scanning assay as methylated 
in colon cancer cell lines, though not in primary tumors. 

Colon cancers that are aberrantly methylated can be detected because they are resistant to 
cutting by the HpaII enzyme. That is, methylation in a colon cancer can be assayed by showing 
PCR amplification of a DNA product, using the primers and conditions shown, from DNA that 
has first been digested with the HpaII restriction enzyme. The assay is diagrammed in Figure 4, 
which provides the sequence of AC063951 between base pairs 82200-83267, designates every 
CpG site with a gray lollipop, shows the HpaII sites in the assay as black lollipops, and also 
shows the location of the PCR primers used in this assay. In this figure, the base pairs have been 
renumbered sequentially from 1-1068, with basepair 82200 being renumbered as basepair 1. 

Figure 5 tabulates the correspondence of assay for methylation over 4 and 6 HpaII sites 
with silencing of expression of the SLC5A8 transcript. As noted, assay of methylation over 4 
HpaII sites detects 100% of colon cancer cell lines that silence the SLC5A8 transcript, but also 
detects some colon cancer cell lines that express SLC5A8. Assay of methylation over 6 HpaII 
sites has 100% specificity and detects only cell lines that have silenced SLC5A8, with a 
sensitivity of 68%. 

Figure 6 tabulates the results of this assay in actual colon cancer tumors. In a group of 
34 human colon cancers, 76% are detected by resistance to cutting at 4 HpaII sites, whereas 50% 
are detected by resistance to cutting at 6 HpaII sites. Both assays detect methylation in some 
normal tissues accompanying methylated cancers, suggesting the detection of microscopic colon 
cancer cells. No methylation is detected in any normal tissue in which the accompanying tumor 
is unmethylated. Because of its high specificity, the assay which employs methylation over 6 
HpaII sites is preferred. 

Figure 7 shows the results of assay for methylation at 61 CpG sites enumerated in Figure 
4, with site 1 corresponding to basepair 466 in Figure 4 and site 61 corresponding to basepair 
1010. The bold arrows correspond to 4 of the HpaII sites at, respectively, basepairs 466, 691, 709 
and 716 in Figure 4. Methylation was assayed by sequencing DNA from samples following 
sodium bisulfite treatment of DNA, which converts cytosine to uracil but leaves methyl-cytosine 
unchanged. Bases that are methylated are coded black, unmethylated bases are coded darker 
gray, and samples with both methylated and unmethylated bases are coded lighter gray. 
Samples analyzed included 9 colon cancer cell lines that do not show SLC5A8 transcript 
expression, 3 colon cancer cell lines that express SLC5A8 transcript, and 6 normal colon tissues. 
Clearly, most colon cancers show substantially more methylation across this region than do 
normal colon tissues. 

To detect the methylation associated with colon cancer, a set of methylation-specific PCR 
primers was designed. DNA from the assayed tissues was first treated with sodium bisulfite to 
convert cytosine to uracil, leaving methyl-cytosine unchanged. PCR primers were designed 
specific for the bisulfite converted sequences arising from methylated or unmethylated templates 
from the anti-sense strand of the target region (note that after bisulfite conversion the sense and 
anti-sense strands are no longer complementary to one another). 
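
The effect of bisulfite treatment on a template can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the toy sequence and the function names (`bisulfite_convert`, `cpg_positions`, `reverse_complement`) are hypothetical and are not part of the disclosed assay, and converted uracil is written as T, as it would be read after PCR amplification.

```python
def bisulfite_convert(seq: str, methylated_positions: set) -> str:
    """Model bisulfite conversion of a single DNA strand.

    Every C is written as T (uracil read as T after PCR) unless its index
    is listed in methylated_positions, in which case it stays C.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def cpg_positions(seq: str) -> set:
    """Indices of every C that is immediately followed by a G."""
    s = seq.upper()
    return {i for i in range(len(s) - 1) if s[i:i + 2] == "CG"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unconverted DNA strand."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Hypothetical toy sense strand containing two CpG sites (indices 1 and 5).
sense = "ACGTCCGGAC"
antisense = reverse_complement(sense)

# Uniformly methylated template: only CpG cytosines survive conversion.
sense_methylated = bisulfite_convert(sense, cpg_positions(sense))
# Uniformly unmethylated template: every cytosine is converted.
sense_unmethylated = bisulfite_convert(sense, set())
antisense_unmethylated = bisulfite_convert(antisense, set())

print(sense_methylated, sense_unmethylated, antisense_unmethylated)
```

In this sketch the converted sense and antisense strands are no longer reverse complements of one another, which is why primers must be designed against one converted strand specifically, and why methylated and unmethylated templates yield different target sequences.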

Figure 8 shows the wild-type sequence of the anti-sense strand of AC063951 between 
bases 82200-83267. Indicated on this diagram is the position of the MS-PCR1 primers (AS-meth) 
and the UMS-PCR1 primers (AS-unmethy). The methyl-specific MS-PCR1 primers 
amplify CpG sites numbered 6, 7, 8 and 15, 16, 17, 18, respectively, in Figure 7. The UMS-PCR1 
primers interrogate CpG sites 7, 8 and 15, 16, 17, 18, respectively. 

Figure 9 shows a blow up of the region and the sequences of the antisense strand that are 
amplified by the methyl-specific and unmethyl-specific PCR primers. 

Figure 10 corresponds to Figure 8, but shows not the wild-type sequence of the anti-sense 
strand but the bisulfite converted sequence of a uniformly methylated antisense strand. 
Indicated again are the positions of the methylation specific PCR primers for the MS-PCR1 
assay. 

Figure 11 also corresponds to Figure 8, but shows not the wild-type sequence of the 
antisense strand but the bisulfite converted sequence of a uniformly unmethylated antisense 
strand. Indicated are the positions of the unmethylation specific PCR primers for the UMS-PCR1 
assay. 

Figure 12 discloses the bisulfite converted sequence of the unmethylated sense strand of 
nucleotides 82200-83267 of AC063951, renumbered such that basepair 82200 is designated as 
nucleotide 1. 

Figure 13 similarly discloses the bisulfite converted sequence of a uniformly methylated 
sense strand of nucleotides 82200-83267. To one skilled in the art these disclosures would 
permit design of methylation specific PCR primers directed against the bisulfite converted 
sequences of either the sense or antisense strands of the region 82200-83267 demonstrated 
herein as enabling the detection of human colon cancers. 

Figure 14 shows the tabular results of the MS-PCR1 assay performed on 31 colon cancer 
cell lines that do or do not express the SLC5A8 transcript. 70% of cell lines that do not express 
SLC5A8 score as methylated in the MS-PCR1 assay. No methylation is detected in any cell line 
that expresses SLC5A8 (100% specificity for prediction of SLC5A8 expression). 

Figure 15 shows the tabular results of the MS-PCR1 assay performed on 63 matched sets of 
primary colon cancer tumor tissue and accompanying normal colon tissue. The assay detects 
59% of all colon cancers. No methylation was detected in any of 26 normal tissues from 
patients with unmethylated colon cancers. 3 individuals with MS-PCR1 positive methylation 
assays in their cancers also showed positivity in their normal colon tissue. It is likely that this 
represents detection of microscopic contamination of these tissues by tumor cells. 

To further test that assertion, Figure 16 gives the results of testing 12 normal colon 
tissues from individuals without colon cancer. None of the tissues test positive in the MS-PCR1 
test. We therefore estimate the sensitivity of MS-PCR1 for detecting colon cancer at 59% and 
the specificity at 100%. 
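
These estimates are simple proportions, and the arithmetic can be sketched as follows. The sketch is illustrative only: the function names are hypothetical, the tumor counts (38 methylated among 64 primary cancers) are those reported in Example 2, section C, and the 12 disease-free normal tissues are those of Figure 16.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of diseased samples that the assay scores positive."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no diseased samples supplied")
    return true_positives / total


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of disease-free samples that the assay scores negative."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no disease-free samples supplied")
    return true_negatives / total


# 38 of 64 primary colon cancers scored as methylated (Example 2, section C).
sens = sensitivity(true_positives=38, false_negatives=64 - 38)
# 0 of 12 normal colon tissues from cancer-free individuals scored positive.
spec = specificity(true_negatives=12, false_positives=0)

print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

With only 12 disease-free samples, the 100% specificity figure is an estimate with a wide uncertainty range rather than a guaranteed property of the assay.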

Figure 17 gives the tabular results of the MS-PCR1 assay of 28 premalignant colon 
adenomas, 68% of which are detected. 






Figure 19 shows RT-PCR detection of the SLC5A8 transcript in normal colon and in a 
minority subset of colon cancer cell lines, but also demonstrates that 23 of 31 colon cancer cell 
lines do not express SLC5A8. 

Figure 20 shows RT-PCR detection of SLC5A8 transcript in colon cancer cell lines that 
have been treated with the DNA-demethylating agent 5-azacytidine. 5-azacytidine reactivates 
expression of the SLC5A8 gene in 6 of 8 colon cancer cell lines, strongly consistent with DNA 
methylation as the cause of silencing of the SLC5A8 transcript. 

Figure 21 demonstrates detection of methylation of the SLC5A8 locus by showing 
resistance of the locus to HpaII digestion. The 4 HpaII assay (as described in the invention 
disclosure) is based on PCR amplification of a portion of the SLC5A8 locus. Lanes labeled U 
show control amplification of undigested SLC5A8 DNA. Lanes labeled M show amplification 
of DNA that has first been cut with the restriction enzyme MspI. MspI digestion of the DNA 
eliminates the ability to amplify the locus. Lanes labeled H show amplification of DNA that has 
first been cut with the restriction enzyme HpaII. HpaII cuts the same sequence as MspI, but 
unlike MspI, HpaII is blocked by DNA methylation. The presence of amplified HpaII cut DNA 
indicates methylation of the DNA in cell lines V5, V6, RKO, V432, HCT116, V5, V6, V489. 
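
The logic of the U, M and H lanes can be sketched as a small model. This is a simplified illustration, not the disclosed protocol: it assumes complete digestion, treats an amplicon as amplifiable only if no CCGG site inside it is cut, and uses a hypothetical toy amplicon and hypothetical function names.

```python
SITE = "CCGG"  # recognition sequence shared by HpaII and MspI


def site_positions(seq: str) -> list:
    """Start indices of every CCGG occurrence in the amplicon."""
    s = seq.upper()
    positions = []
    start = s.find(SITE)
    while start != -1:
        positions.append(start)
        start = s.find(SITE, start + 1)
    return positions


def amplifies_after_digest(amplicon: str, methylated_sites: set, enzyme: str) -> bool:
    """Predict whether PCR across the amplicon succeeds after digestion.

    enzyme is "none" (lane U), "MspI" (lane M) or "HpaII" (lane H).
    methylated_sites holds the start indices of CCGG sites that are methylated.
    """
    sites = site_positions(amplicon)
    if enzyme == "none":
        return True
    if enzyme == "MspI":
        # MspI cuts CCGG whether or not the site is methylated.
        return not sites
    if enzyme == "HpaII":
        # HpaII is blocked by methylation, so every site must be methylated.
        return all(pos in methylated_sites for pos in sites)
    raise ValueError(f"unknown enzyme: {enzyme}")


# Hypothetical amplicon carrying two CCGG sites (start indices 2 and 9).
amplicon = "AACCGGTTACCGGA"
fully_methylated = set(site_positions(amplicon))

for lane, enzyme in (("U", "none"), ("M", "MspI"), ("H", "HpaII")):
    print(lane, amplifies_after_digest(amplicon, fully_methylated, enzyme))
```

The model also shows why an assay spanning more HpaII sites is more stringent: a single unmethylated site within the amplicon is enough to abolish the H-lane product.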

Figure 22 demonstrates detection of SLC5A8 DNA methylation in primary colon cancer 
tumors but not in matched normal tissue from the same patients. Samples labeled T represent 
colon cancer tumor tissue, whereas samples labeled N represent the matched normal tissue. 
Detecting a PCR amplified band after HpaII digestion (lanes labeled H) indicates methylation 
of the SLC5A8 locus. Methylation of tumor but not normal tissue is seen in samples 529, 365, 
and 23-21. 

Example 2: 

A. Identification of the SLC5A8 gene. 

Methylation events in genomic DNA from 12 colon cancer cell lines were profiled by 
restriction landmark genomic scanning. Out of 1,231 unselected CpG islands visualized, spot 
3D41 was detected as absent and presumptively methylated in 11 of the 12 colon cancer cell 
lines. A 510 base pair genomic fragment surrounding the 3D41 site was cloned and shown to 
correspond to genomic sequence on human chromosome 12q22-23. RNA from normal human 
colon mucosa was used for connection RT-PCR that linked together over 10 EST sequences 
mapping to this genomic region. New sequence was generated both by sequencing of these 
RT-PCR amplified products and by sequencing IMAGE clones corresponding to these ESTs 
(Figure 28). This established that the 3D41 site was included within a new transcript encoded 
by a novel gene (Figure 23B). This gene, located on chromosome 12q22-23, is comprised 
of 15 exons, with the site from RLGS located in exon 1 (Figure 23A). The newly identified 
transcript includes an in-frame TAA stop codon 5' to the presumptive ATG start codon, which 
additionally is embedded within a GCCATGG sequence that conforms to the standard for a 
good Kozak sequence. BLAST alignment of the predicted protein product of this novel 
transcript showed the most closely related proteins to be the human sodium iodide symporter 
SLC5A5 (46% homology) and the human sodium-dependent multivitamin transporter SLC5A6 
(43% homology), both of which belong to the solute carrier 5 family (SLC5) of sodium coupled 
transporters (Figure 29). Moreover, analysis of the predicted novel protein by the TMHMM 
prediction program (http://www.cbs.dtu.dk/services/TMHMM/) identified 13 transmembrane 
fragments, which is consistent with structural features of the sodium iodide symporter. Thus 
structurally, this new transcript encodes a novel member of the SLC5 sodium solute symporter 
family (SSF), and HUGO assigned the encoded protein the name of SLC5A8. A mouse 
protein of unknown function shows 77% identity to SLC5A8, and is likely the mouse homolog 
of the human protein (Figure 29). RT-PCR confirmed SLC5A8 transcript was expressed by 
normal colon mucosa, as well as by kidney, lung, esophagus, small bowel, stomach, thyroid, and 
uterus, with greatest expression seen in kidney. 

B. SLC5A8 is frequently silenced and methylated in colon cancer cell lines. 

RT-PCR was used to further characterize SLC5A8 expression in normal colon mucosa 
compared to a collection of 31 colon cancer cell lines. Whereas the SLC5A8 transcript was well 
expressed in normal colon, it proved absent in 23 of the 31 colon cancer cell lines (Figure 24A). 
The methylation of SLC5A8 exon 1 detected by RLGS suggested the hypothesis that aberrant 
methylation might be the mechanism for silencing of SLC5A8 expression. Consistent with this 
hypothesis, treatment of SLC5A8 silenced cell lines with the demethylating agent 5-azacytidine 
reactivated SLC5A8 expression in 6 of 8 colon cancer cell lines tested (Figure 24B and data not 
shown). Sequencing of the SLC5A8 transcript in the 8 colon cancer cell lines in which it was 
expressed showed only wild-type sequence with no mutations. Thus methylation, but not 
mutation, appeared to be the putative mechanism for inactivating SLC5A8 in colon cancer. 

To identify target sequences for aberrant SLC5A8 methylation in colon cancer, we 
investigated a dense CpG island (G+C%=70%, CG/GC=0.9) located in SLC5A8 exon 1 and 
surrounding the 3D41 site. This region covered 573 base pairs and included 62 CpG 
dinucleotides (Figure 30A). In contrast, the region immediately 5' of exon 1 showed only a 
46% G+C content. We used sodium bisulfite treatment of genomic DNA to convert 
unmethylated cytosines to uracil, while leaving methylated cytosines unchanged (Herman and 
Baylin, 1998, Current Protocols in Human Genetics, N. E. A. Dracopoli, ed., John Wiley & 
Sons, 2:10.6.1-10.6.10). Sequencing of PCR amplified bisulfite converted SLC5A8 exon 1 
genomic DNA was then used to determine the methylation status of each of the 62 target 
cytosines within the CpG island domain. Comparing the findings in nine SLC5A8-silenced cell 
lines versus those in three SLC5A8-expressing cell lines and in six samples of SLC5A8 
expressing normal colon mucosa defined a 182 bp subregion. In the nine SLC5A8-silenced cell 
lines this subregion demonstrated uniform methylation of all CpG cytosines, whereas these 
cytosines were uniformly unmethylated in the three SLC5A8 expressing cell lines and six 
normal colon mucosa samples (Figure 30B). Primers for assay of this subregion by methylation 
specific PCR (MS-PCR) were designed such that, following bisulfite conversion, amplification 
products would selectively be derived from either methylated (M) or unmethylated (U) genomic 
templates (Herman and Baylin, 1998, Current Protocols in Human Genetics, N. E. A. Dracopoli, 
ed., John Wiley & Sons, 2:10.6.1-10.6.10). MS-PCR assay of 31 total colon cancer cell lines 
demonstrated SLC5A8 exon 1 methylation was present in 16 cases (52%), and in each of these 
methylated cell lines, no SLC5A8 transcript was detectable (Figure 24C). In contrast, in each of 
the 8 SLC5A8 expressing cell lines MS-PCR assayed exon 1 as unmethylated (Figure 24D). In 
the 7 remaining instances, SLC5A8 expression was absent, but aberrant methylation was not 
detected as the reason. Moreover, in the case of two of the SLC5A8-methylated cell lines 
(V425 and V670), DNA from antecedent tumor and matched patient normal tissue was also 
available. In each of these cases, MS-PCR confirmed that SLC5A8 methylation was present in 
the primary tumor tissues, but was absent in the matched normal tissues (Figure 24F). Thus the 
SLC5A8 methylation and silencing detected in colon cancer cell lines reflects somatic 
aberrations present in primary colon cancer tissues. We note that the finding of gene silencing 
associated with aberrant methylation in a first exon region corresponding to 5' untranslated 
sequences has existing precedent at other loci (Attwood et al., 2002, Cell Mol Life Sci 59: 241- 
257; Jones, P. A. 1999, Trends Genet 15: 34-37). 

In previous studies our group has noted that in colon cancers aberrant methylation of 
hMLH1 and of HLTF commonly silences both maternal and paternal alleles in the same tumor 
(Veigl, et al., 1998, Proc Natl Acad Sci U S A 95:8698-702; Moinova, et al., 2002, Proc Natl 
Acad Sci U S A 99:4562-7). Consistent with this mechanism, testing of microsatellite markers 
D12S1041 and D12S1727, which flank SLC5A8, showed the presence of two distinguishable 
parental SLC5A8 chromosomal regions in 10 of 10 colon cancer cell lines that showed the 
presence of only methylated SLC5A8 exon 1. 

C. SLC5A8 methylation is commonly present in primary colon cancers and in colon adenomas. 

To further establish the frequency of SLC5A8 exon 1 methylation in primary colon 
cancer tumors, we analyzed by MS-PCR an additional 64 pairs of primary colon cancer tumor 
tissues as well as their accompanying matched normal colon tissues. SLC5A8 methylation was 
detected in 38 of 64 (59%) primary colon cancers (Figure 24F and Table 2 below). In 35 of 38 
cases (92%) in which colon tumors showed SLC5A8 methylation, this methylation was not 
detected in the same individuals' normal colon tissues. SLC5A8 exon 1 methylation thus 
substantially arose in these individuals' cancers as part of and during the neoplastic process. In 
the 3 cases in which SLC5A8 methylation was detected in both an individual's cancerous and 
normal colon tissues, these findings likely indicate either the presence of some cancer cells 
within the grossly normal resected tissue, or the possibility that the cancer arose from a field of 
SLC5A8 methylated cells. The rarity of detecting SLC5A8 methylation in normal colon tissues 
is highlighted by noting that no SLC5A8 methylation was detected in any of the 26 normal 
colon tissues in which the accompanying colon cancer was also unmethylated (Table 2 below), 
and moreover, that no SLC5A8 methylation was detected in any of 12 additional normal colon 
tissues from resections done for non-cancer diagnoses. 

Table 2. SLC5A8 Methylation in Colon Tumors and Matched Normal Mucosa. Shown is the 
characterization of 64 pairs of colon cancer tumors and matched normal colon tissues assayed 
for methylation of SLC5A8 exon 1 by MS-PCR. Indicated are the numbers (and percentages) of 
tissue pairs with each of the four possible methylation phenotypes. 

                                      NORMAL TISSUE 
                                 Methylated    Unmethylated 
TUMOR TISSUE    Methylated         3 (5%)        35 (54%) 
                Unmethylated       0 (0%)        26 (41%) 



Among all primary cancers and cell lines analyzed, the finding of SLC5A8 methylation 
in colon cancer tumors and cell lines was not significantly correlated with either patients' sex 
(P=0.39) or age (P=0.52), with a median age of 69 in persons with SLC5A8-methylated cancers 
versus 67 in those with SLC5A8 unmethylated cancers. Moreover, the distribution by tumor 
stage (Dukes' stage B, C, D primary tumor; or metastatic cancer deposit) was not significantly 
different between SLC5A8-methylated and nonmethylated colon cancers (P=0.77) (Table 3 
below). SLC5A8 methylated and unmethylated cancers also showed no significant difference 
with respect to site of origin in the rectum, left colon, or right colon (P=0.47) (Table 4 below). 

Table 3. Distribution of SLC5A8 methylation by tumor stage. Shown are numbers (and %) of 
colon neoplasms (tumor and cell lines) in each category defined by clinical stage and SLC5A8 
methylation status. 

Tumor Stage          SLC5A8 Methylated    SLC5A8 Unmethylated 
Adenoma                  17 (24%)              12 (23%) 
Duke's B                 24 (34%)              16 (30%) 
Duke's C                 15 (21%)              13 (25%) 
Duke's D                  6 (8%)                5 (9%) 
Metastatic lesion         7 (10%)               7 (13%) 



Table 4. Distribution of SLC5A8 methylation by tumor site. Shown are numbers (and %) of 
colon neoplasms (tumor and cell lines) in each category defined by location in the colon and 
SLC5A8 methylation status. 

Tumor site       SLC5A8 Methylated    SLC5A8 Unmethylated 
Right colon          12 (23%)              13 (35%) 
Left colon           30 (59%)              20 (54%) 
Rectal                9 (18%)               4 (11%) 



To determine the timing of onset of SLC5A8 silencing during colon carcinogenesis, we 
additionally analyzed a group of 29 adenomas for SLC5A8 exon 1 methylation. SLC5A8 
methylation was detected in 17 of the 29 (59%) adenoma cases. SLC5A8 methylation thus 
appears to be an early event that is already established in colon neoplasia by the adenoma stage. 

D. Quantitative assay of SLC5A8 exon 1 methylation. 

To derive a quantitative measure of SLC5A8 methylation, we employed a real time MS-PCR 
assay whose results were expressed as 1000 times the ratio of methylated SLC5A8 reaction 
product to a control MYOD1 reaction product (Usadel, et al., 2002, Cancer Res 62:371-5). In 
this assay, 0 methylation was detected in the Vaco9 SLC5A8 expressing colon cancer cell line, 
and a methylation value of 1000 was detected in the SLC5A8 methylated and silenced RKO 
colon cancer cell line. As shown in Figure 25A, assay for SLC5A8 exon 1 methylation in 11 
normal colon mucosal samples derived from non-cancer resections yielded only barely 
detectable methylation values (mean value = 24; range = 4-82) and defined an "unmethylated 
normal range" of values all < 100. Analysis of 29 normal colon samples derived from colon 
cancer resections gave similarly low values with a mean value =02 and with a single outlier 
sample (value = 159) falling outside the range defined by the non-cancer derived normal tissues. 
This observation essentially replicated our previous observation of rare faint methylation events 
detected in some cancer associated normal tissue. In contrast, analysis of colon cancer samples 
clearly distinguished two populations of tumors. Twelve cancers were deemed unmethylated, as 
they showed methylation values falling well within the population normal range (mean value 
= 12; range = 0-58) (Figure 25A), and hence were indistinguishable from unmethylated normal 
tissues. In contrast, 17 cancers with methylation values greater than the normal range comprised 
a distinct "methylated" group of cancers that was characterized by a mean methylation value of 
747 and a range of 121-2549 (Figure 25A). The mean methylated colon cancer thus displayed 
75% of the level of methylation measured in a pure cell line population of methylated RKO 
cells. The heterogeneity in measured methylation values among the methylated colon cancers 
may in part derive from differences among the tumors in levels of contaminating and infiltrating 
non-cancer cells. The methylated and unmethylated cancer populations defined by real time 
MS-PCR respectively corresponded to the tumors classified as methylated and unmethylated in 
the previous non-quantitated MS-PCR reaction. 
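
The scoring rule used in this section can be sketched as follows. The sketch is illustrative and rests on assumptions: the function names and the signal inputs are hypothetical, the inputs are taken to be already-calibrated quantities of methylated SLC5A8 and MYOD1 product, and the cutoff of 100 is the upper bound of the unmethylated normal range stated above.

```python
NORMAL_UPPER_LIMIT = 100.0  # normal tissues gave values all < 100


def methylation_value(slc5a8_methylated: float, myod1: float) -> float:
    """1000 times the ratio of methylated SLC5A8 product to MYOD1 product."""
    if myod1 <= 0:
        raise ValueError("MYOD1 control signal must be positive")
    if slc5a8_methylated < 0:
        raise ValueError("SLC5A8 signal cannot be negative")
    return 1000.0 * slc5a8_methylated / myod1


def classify(value: float, limit: float = NORMAL_UPPER_LIMIT) -> str:
    """Label a sample by comparing its value with the normal range."""
    return "unmethylated" if value < limit else "methylated"


# Hypothetical calibrated signals for three samples.
samples = {
    "normal mucosa": (0.024, 1.0),
    "tumor A": (3.0, 4.0),
    "tumor B": (0.0, 2.5),
}
for name, (target, control) in samples.items():
    value = methylation_value(target, control)
    print(f"{name}: {value:.0f} -> {classify(value)}")
```

Normalizing to a control locus makes the value independent of the amount of input DNA, which is what allows samples of different size and purity to be compared on one scale.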

E. Detection of SLC5A8 methylation in aberrant crypt foci. 

The finding of SLC5A8 methylation in colon adenomas prompted us to consider that 
SLC5A8 methylation might be an early event in human colon neoplasia. The earliest 
morphologically identifiable colon neoplasias putatively are aberrant crypt foci (ACF) (Siu et 
al., 1999, Cancer Res 59: 63-66). These microscopic morphologically aberrant multicrypt 
structures are recognizable in unembedded colon under low power magnification. Moreover, a 
subset of ACF lesions demonstrate both histologic dysplasia and mutations of the APC tumor 
suppressor gene (Bird, 1987, Cancer Lett 37:147-51; Pretlow, et al., 1991, Cancer Res 51:1564- 
7), suggesting that at least some ACF have potential to progress to colon adenomas and cancers. 
To assess a possible role of SLC5A8 methylation in ACF development, 15 ACF, composed of 
from 17 to 155 crypts (48±36 crypts, mean ± standard deviation), were dissected from 11 
different patients' colons bearing either cancer or adenomas. From these same 11 cases, 24 
similarly sized tissue samples were dissected from mucosal regions that appeared normal under 
low power magnification. Real time MS-PCR analysis of SLC5A8 methylation in the 24 control 
normal samples gave results similar to those obtained in previous normal mucosal samples, with 
a mean SLC5A8 methylation value of 12, and with only one of these 24 new samples 
(methylation value of 117) falling just outside of the previously determined normal limit of 100 
(Figure 25B). In contrast, analysis of DNA from the ACF revealed two distinct populations, 
with 8 of 15 ACF falling within the normal range (mean = 34, range = 0-113), and with 7 of 
15 ACF samples demonstrating SLC5A8 values that fell well within the range of methylated 
cancers (mean = 355, range = 287-420) (Figure 25B). In contrast, none of these 15 aberrant crypt 
foci demonstrated aberrant methylation of hMLH1, which thus likely arises later during colon 
carcinogenesis. These findings suggest that SLC5A8 methylation is indeed an early aberration 
that precedes adenoma formation and is detectable in aberrant crypt foci. This finding also 
further strengthens the model that suggests a subset of aberrant crypt foci are likely to progress 
to more advanced colonic neoplasms. 

F. SLC5A8 methylation as a serologic marker of colon cancer. 

SLC5A8 methylation was detected in 59% of our primary colon samples. In these same 
samples we had previously noted a 44% frequency of methylation of HLTF, a SWI/SNF family 
gene (Moinova et al., 2002, Proc Natl Acad Sci USA 99: 4562-4567), and had also found a 44% 
frequency of methylation of p16 (Figure 31) (Herman et al., 1995, Cancer Res 55: 4525-4530; 
Gonzalez-Zulueta et al., 1995, Cancer Res 55: 4531-4535). These data suggest SLC5A8 
methylation might be a high quality marker of colon cancer presence. In this regard, we and 
others have shown that aberrantly methylated genomic DNA from specific loci can be detected 
in the serum of some cancer patients (Grady et al., 2001, Cancer Res 61: 900-902; Hibi et al., 
1998, Cancer Res 58: 1405-1407; Jeronimo et al., 2001, J Natl Cancer Inst 93: 1747-1752; 
Usadel et al., 2002, Cancer Res 62: 371-375). Accordingly, we characterized the level of 
SLC5A8 methylation in ethanol precipitable DNA prepared from the serum of colon cancer 
patients (Grady et al., 2001, Cancer Res 61: 900-902). SLC5A8 methylation was totally 
undetectable, with a measured value of 0, in DNA extracted from each of 13 serum samples from 
individuals with colon cancers in which SLC5A8 assayed as unmethylated (Figure 26). In 
contrast, SLC5A8 methylation was detectable in serum DNA from 4 of 10 patients in which the 
underlying colon cancer assayed as SLC5A8 methylated (Figure 26). A positive signal for 
MYOD1 verified the presence of input DNA in each of these assays. While serologic assays 
for methylated DNA as a marker of cancer are clearly in the early stages of investigation, we 
note that a panel of methylated genes that included SLC5A8, HLTF, p16 and hMLH1 provided 
greater sensitivity than any single locus alone for detecting an aberrant methylation event in our 
set of 64 primary colon cancers (Figure 31). 

G. SLC5A8 suppression of colon cancer colony formation. 

The high frequency of SLC5A8 methylation observed in colon cancer suggested that 
inactivation of this gene might confer a selective advantage. To assay for such an advantage, we 
examined the effect of SLC5A8 transfection in three colon cancer cell lines (V400, RKO and 
FET) in which the endogenous SLC5A8 gene was methylated and silenced, as compared with 
three colon cancer cell lines (V457, V9M and V364) in which the endogenous SLC5A8 gene 
remained unmethylated and expressed. Reconstitution of SLC5A8 expression in SLC5A8- 
methylated cells suppressed colony-forming ability by at least 75% in each of the three lines 
tested (P<0.01) (Figure 27B). In contrast, transfection of SLC5A8 did not show significant 
colony suppression in any of the three cell lines that already expressed an endogenous 
SLC5A8 allele (Figure 27A) (P<0.01 for the difference in effect of SLC5A8 transfection in 
methylated versus unmethylated cell lines). Transient transfection showed that both 
SLC5A8-methylated and unmethylated cells were able to express comparable levels of 
exogenous SLC5A8, as determined by western analysis for a V5 epitope tag attached to the 
SLC5A8 cDNA. These findings suggest that SLC5A8 methylation and silencing confers a 
specific growth advantage in the subset of colon cancers in which this locus is inactivated. 

Consistent with this interpretation, we found that 4 of 5 of the rare SLC5A8 expressing 
clones that grew out following transfection of the SLC5A8 methylated V400 colon cancer cell 
line were markedly suppressed in their ability to form xenograft tumors in athymic mice 
(Figure 32). 

H. Discussion. 

In this study, we have identified a novel gene, SLC5A8, that we demonstrate is a new 
candidate colon cancer suppressor gene. We find that SLC5A8 encodes a sodium transporter 
and is a new member of the sodium solute symporter family (SLC5). SLC5A8 is frequently 
targeted for methylation and silencing in human colon cancer, with aberrant SLC5A8 exon 1 
methylation detected in 52% of colon cancer cell lines and in 59% of primary colon 
cancers. All colon cancer cell lines that showed SLC5A8 exon 1 methylation were silenced for 
SLC5A8 expression, and SLC5A8 expression could be restored by treatment with the 
demethylating agent 5-azacytidine. We therefore conclude that epigenetic gene silencing, which 
is reflected by aberrant SLC5A8 methylation, represents the principal mechanism for inactivating 
this gene in colon cancer. Moreover, our finding that exogenous SLC5A8 specifically 
suppresses colony forming activity in colon cells that have inactivated this allele supports the 
hypothesis that SLC5A8 inactivation confers a selectable advantage in neoplastic colon 
epithelial cells. Colon cells that retain SLC5A8 are insensitive to the introduction of an 
exogenous allele, and presumably bear a mutation elsewhere that renders them tolerant to 
continued SLC5A8 expression. Also supporting the view that SLC5A8 methylation is a pathogenetic 
event in colon neoplasia is our finding that SLC5A8 methylation is a very early event that is 
detectable in 47% of aberrant crypt foci, which are the earliest detectable morphologic 
abnormality of the colon epithelium. 

SLC5A8 methylation may also play an etiologic role in malignancies additional to colon 
cancer. In earlier studies, we noted that SLC5A8 methylation is present in a subset of 
breast and stomach cancers (Table 5 below). 

Table 5. SLC5A8 methylation in additional cancers. Shown are the results of MS-PCR assay 
for SLC5A8 exon 1 methylation in primary human tumors. In each case, paired normal tissue 
assayed as unmethylated. 

Cancer Types            Breast    Stomach    Kidney 
SLC5A8 methylated          4         4          0 
SLC5A8 unmethylated       16         2          7 



Both molecular homology and functional data suggest that SLC5A8 functions as a 
sodium solute symporter. There are 109 currently known members of the sodium solute 
symporter family, which functions to co-transport sodium coupled to solutes as diverse as iodine 
(NIS/SLC5A5), glucose (SGLT1/SLC5A1; SGLT2/SLC5A2), inositol (SMIT/SLC5A3), and 
water soluble vitamins (SMVT/SLC5A6) (Smanik et al., 1996, Biochem Biophys Res Commun 
226: 339-345; Prasad et al., 1998, J Biol Chem 273: 7501-7506; Wright et al., 1994, J Exp Biol 
196: 197-212). Elucidating the putative solute cotransported by SLC5A8 may provide future 
insight both into the mechanism of SLC5A8 growth suppression, as well as leads for potential 
development of novel agents useful for colon neoplasia prevention and treatment. 

Materials And Methods 

Sequences. Human SLC5A8 mRNA and gene sequence accession numbers as deposited 
by our group are AF53621 and AF536217. The SLC5A8 murine homolog is accession number 
BC017691. Contemporaneously with our Genbank entry, SLC5A8 mRNA sequence was also 
independently deposited under accession number AY081220 (Rodriguez et al., 2002, J Clin 
Endocrinol Metab. 87:3500-3). 

Restriction Landmark Genomic Scanning (RLGS). RLGS was performed as previously 
described (Costello et al., 2000, Nat Genet 24: 132-138). 

Amplification and Sequencing of SLC5A8. The primers used for RT-PCR assay of a 
SLC5A8 fragment are 5'-TCCGAGGTCTACCGnTTG-3', and 5'-GGGCA GGGGC ATAAA[...]. 
The PCR parameters were 35 cycles of 95°C (45 s), 54°C (45 s), 72°C (60 s), 72°C 
(10 min), and 4°C to cool. The full length SLC5A8 ORF was amplified using primers: 5'- 
TCCGGGATAAGAAGTGCG-3' and 5'-TAGTATCAGAGCAGCTTCACAAAC-3'. GC-rich 
cDNA polymerase kit (Clontech) was used and PCR parameters were 35 cycles of 95°C (45 s), 
62°C (45 s), 72°C (90 s), 72°C (10 min), and 4°C to cool. Sequencing primers were: 
5'-TTTGTGGTGGTCATCAGCG-3', 5'-GGGCAGGGGCATAAATAAC-3', 
5'-AGGCTGTGGTGATGCAAGGT-3', 5'-TTAATGCCTrAGCAGCAG-3', and 
5'-CCTCCACTTCCTGAGAGAAC-3'. 

Constructs. To construct the V5 tagged SLC5A8 expression vector, the following PCR 
primers were used: 5'-TCCGGGATAAGAAGTGCG-3' and 5'-TCTAGTATCA 
GAGCAGCTACACAA-3'. The PCR conditions were the same as employed for amplification 
of the full length ORF. PCR products were cloned into pcDNA3.1/V5-His-TOPO vector 
(Invitrogen). 

Serum DNA purification. Blood was drawn into red/grey vacutainer collection tubes and 
allowed to clot for 2 hours. It was then spun in a clinical table top centrifuge for 15 min at 3000 
rpm at room temperature. Serum was collected using a sterile pipette, divided into 1 ml aliquots, 
and stored at -80°C. Serum DNA from patients was purified as described previously (Grady et 
al., 2001, Cancer Res 61:900-902). 

Western Analysis. Approximately lO' cells were lysed in cell lysis buffer [50 mM 
Tris·HCl (pH 7.4)/1 mM EGTA/1% Nonidet P-40/0.25% sodium deoxycholate/150 mM NaCl]. 
Equal amounts of protein were subjected to SDS polyacrylamide gel electrophoresis and then 
transferred to a PVDF nylon membrane (Millipore), which was probed with 1:200 dilution of 
mouse anti-V5 monoclonal antibody (Invitrogen). Immune complexes were visualized with 
ECL+Plus Western blotting detection kit (Amersham) after incubation with horseradish 
peroxidase-coupled secondary antibody (Santa Cruz). 

Sodium Bisulfite Treatment: Flanking PCR and MS-PCR. Sodium bisulfite treatment to 
convert unmethylated cytosine to thymidine was performed similarly as described (Grady et al., 
2001, Cancer Res 61:900-902). Primers that flank the SLC5A8 exon 1 CpG island are 
5'-GGTGAA GGTAAA GATGTT AAAAATG-3' and 5'-AGAAGT AAAAAG TGGAAT 
TGTGATG-3'. PCR was carried out by using a hot start at 95°C (7 min) and the following cycling 
parameters: 35 cycles of 95°C (45s), 56°C (45s), 72°C (45s), 72°C (10 min), and 4°C to cool. 
Primers to amplify the methylated allele are AS-meth-442-459s: 5'-TGGAAG GTATTT 
CGAGGC-3' and AS-meth-550as: 5'-ACAACG AATCGATTTTCCG-3'. PCR parameters are 
31 cycles of 95°C (45s), 56°C (45s), 72°C (45s), 72°C (10 min), and 4°C to cool. Primers to 
amplify the unmethylated allele are AS-unmeth-442s: 5'-TTGAAT GTATTT TGAGGTG-3' 
and AS-unmeth-542as: 5'-TCAATT TTCCAA AATCCC-3'. PCR parameters are 31 cycles of 
95°C (45s), 46°C (45s), 72°C (45s), 72°C (10 min), and 4°C to cool. 
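
The rationale for using separate methylated and unmethylated primer sets can be illustrated in silico. The following Python sketch is illustrative only and is not part of the protocol described above; the function name bisulfite_convert and the toy template are hypothetical. It assumes complete conversion, in which every unmethylated cytosine is read as thymine after PCR, while methylated cytosines in CpG dinucleotides are protected.

```python
def bisulfite_convert(seq: str, cpg_methylated: bool) -> str:
    """Return the top-strand sequence expected after bisulfite treatment and PCR.

    Assumption: conversion is complete, so every unmethylated C reads as T.
    If cpg_methylated is True, each C immediately followed by G is protected.
    """
    seq = seq.upper()
    out = []
    for i, base in enumerate(seq):
        is_cpg = base == "C" and i + 1 < len(seq) and seq[i + 1] == "G"
        if base == "C" and not (cpg_methylated and is_cpg):
            out.append("T")  # unmethylated cytosine is converted
        else:
            out.append(base)  # methylated CpG cytosine or non-C base is kept
    return "".join(out)


# Hypothetical toy template; this is not an SLC5A8 sequence.
template = "ACGTCCGA"
methylated_product = bisulfite_convert(template, cpg_methylated=True)
unmethylated_product = bisulfite_convert(template, cpg_methylated=False)
```

Under these assumptions, a primer directed to the methylated allele retains CG at CpG positions, whereas a primer directed to the unmethylated allele carries TG at the same positions, which is why the two primer sets discriminate between the alleles.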

Methylation-Specific Real-time PCR. The same MS-PCR primers as above 
(AS-meth-442-459s and AS-meth-550as) were first used to amplify a bisulfite converted methylated 
SLC5A8 exon 1 template. A fluorogenic hybridization probe was designed using sequences 
specific for the sodium bisulfite converted SLC5A8 methylated template. The sequence was the 
following: 5'-6FAM-CAAGGACGAAT ACAAAAACG AGTACGAAC-BHQ-2-3'. Bisulfite 
converted sequences from the MYOD1 gene were used as an internal reference as described 
(Usadel et al., 2002, Cancer Res 62: 371-375). Primers and probes for MYOD1 were: forward 
primer 5'-CCAAGTCGA AATGCCGTC TGTAT-3'; reverse primer 5'-TGATTAATT TA 
GATTGGGTTT AGAGAAGGA-3'; and probe: 5'-6FAM-TCCCTTCCT ATTCCTAAA 
TGCAAC CTAAATACGTCC-BHQ-2-3'. All the above primers and probes were synthesized by 
Integrated DNA Technologies, Inc. For the gene of interest, SLC5A8, the reaction mix 
contained 600 nM primer, 200 nM probe, 5.5 mM Mg2+, 1X Supermix from Bio-Rad. The total 
volume was 25 µl. For the MYOD1 gene, the reaction mix contained 400 nM primer, 200 nM 
probe, 3 mM Mg2+, 1X Supermix from Bio-Rad. The total volume was also 25 µl. Thermal 
cycling was initiated with 50°C for 2 min, then 95°C for 10 min, followed by 55 cycles of 95°C 
for 15 sec and 60°C for 1 min. PCR was performed in separate wells for each probe/primer set. 
Each plate contained multiple positive controls, negative controls and water blanks. Colon 
cancer cell line RKO was used for a positive control, and V9M as a negative control. Serial 
dilutions of RKO DNA were used to create a standard curve. SLC5A8 methylation was 
determined as the ratio of SLC5A8:MYOD1 = 2 exp -(CT_SLC5A8 - CT_MYOD1). 
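
The ratio calculation above can be sketched as follows. This Python fragment is a minimal illustration only; the function name relative_methylation and the threshold-cycle values are hypothetical and are not data from the work described herein. It assumes that both reactions amplify with approximately the same efficiency, so that each one-cycle difference corresponds to a two-fold difference in template.

```python
def relative_methylation(ct_slc5a8: float, ct_myod1: float) -> float:
    """Return the SLC5A8:MYOD1 ratio, 2 raised to -(CT_SLC5A8 - CT_MYOD1).

    Assumption: both reactions amplify with about the same efficiency.
    """
    return 2.0 ** -(ct_slc5a8 - ct_myod1)


# Hypothetical threshold cycles: SLC5A8 crosses threshold two cycles
# later than the MYOD1 reference, giving one quarter of the reference signal.
example_ratio = relative_methylation(ct_slc5a8=30.0, ct_myod1=28.0)
```

A sample in which the methylated SLC5A8 template crosses the threshold later than the MYOD1 reference gives a ratio below 1, and equal threshold cycles give a ratio of exactly 1.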

Aberrant Crypt Foci. Aberrant crypt foci (ACF) (Bird, 1987, Cancer Lett 37: 147-151; 
Pretlow et al., 1991, Cancer Res 51: 1564-1567; Siu et al., 1999, Cancer Res 59: 63-66) were 
isolated from grossly normal human colonic mucosa according to the method of Bird et al. (Bird 
et al., 1997, Cancer Lett 116: 15-19). Strips of human colonic mucosa, stored over liquid 
nitrogen, were thawed rapidly in 1% paraformaldehyde and fixed flat in 70% ethanol for 30 min 
at 4°C (Bird et al., 1997, Cancer Lett 116: 15-19). The colonic strips were stained for 2 min in 
0.2% methylene blue (Chroma-Gesellschaft Schmid & Co, distributed by Roboz Surgical 
Co, Washington, DC) in 0.1 M sodium phosphate buffer (pH 7.4), rinsed in 1% 
paraformaldehyde for 15 min, transferred mucosal side up to a glass slide and viewed at 30X 
magnification under a dissecting microscope. The ACF were teased from the mucosa with 
microdissection forceps (FWR #55 Dumont Bio Inox Forceps, 0.05 x 0.02 mm tips), placed in 
microfuge tubes, and stored over liquid nitrogen. The control for each ACF was a similar 
number of microscopically normal crypts teased from the same mucosa. 

Cell Culture and Clonogenic Assays. Vaco cell lines were cultured as previously 
described (Veigl et al., 1998, Proc Natl Acad Sci U S A 95: 8698-8702; Markowitz et al., 1995, 
Science 268: 1336-1338; Willson et al., 1987, Cancer Res 47: 2704-2713). FET and RKO were 
the kind gift of Dr. M. Brattain (Roswell Cancer Institute, Buffalo, NY). Colony formation 
assays were performed as described (Moinova et al., 2002, Proc Natl Acad Sci U S A 99: 4562- 
4567). Briefly, colon cancer cells were plated on a rat tail collagen matrix (Willson et al., 1987, 
Cancer Res 47: 2704-2713) (which was found necessary for proper membrane localization of 
SLC5A8 protein). Cells were then transfected with either a SLC5A8 expression vector or a 
control empty vector, and the number of stable colonies arising after selection in G418 was 
counted for each vector. 

5-Azacytidine Treatment. The treatment was performed as described previously (Veigl 
et al., 1998, Proc Natl Acad Sci U S A 95: 8698-8702). Briefly, cells were treated for 24 h on 
day 2 and day 5 with 5-azacytidine (Sigma) at 1.5 µg/ml. The medium was changed 24 h after 
addition of the 5-azacytidine (i.e., on day 3 and day 6). 

Statistical Methods. Association of SLC5A8 methylation with sex was analyzed by 
using two-tailed Fisher's exact tests. Association of SLC5A8 methylation status with tumor site 
or stage was analyzed by using Pearson's χ² statistics. Comparisons of age distributions based 
on SLC5A8 methylation were done by using Wilcoxon nonparametric tests. Comparisons of 
colony counts after transfection with different vectors were done by t tests and linear models. 
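
As an illustration of the first of these tests, a two-tailed Fisher's exact test on a 2 x 2 table can be sketched with the Python standard library alone. The function name fisher_exact_two_tailed and the counts below are hypothetical and are not results of the work described herein; the sketch uses the common convention of summing the probabilities of all tables, with the observed margins, that are no more likely than the observed table.

```python
from math import comb


def fisher_exact_two_tailed(a: int, b: int, c: int, d: int) -> float:
    """Two-tailed Fisher's exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_probability(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so that ties with the observed table are included.
    p_value = sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * (1 + 1e-9)
    )
    return min(p_value, 1.0)


# Hypothetical counts: rows are sex, columns are methylated / unmethylated.
p_value = fisher_exact_two_tailed(3, 1, 1, 3)
```

With these hypothetical counts the test gives no evidence of association; a statistics package would normally be used in practice, and the hand-written version is shown only to make the calculation explicit.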

Hpa2 site assays. (1) For 4 Hpa2 site assays, the following primers were used: 5'- 
CCAGCGAAGGCGTAGTAGAT-3' (3D41-Hpa2-190R) and 5'-GGCTCCAGTTCTCA 
TCTGCT-3' (3D41-Hpa2-633F). The Advantage-GC-genomic DNA polymerase kit was used. 
Thermal cycling was performed at 95°C for 1 min, 95°C for 45 sec, 63°C for 45 sec, 72°C for 90 sec, 
then followed by 26 cycles, and finally 72°C for 5 min. (2) For 6 Hpa2 site assays, the following 
primers were used: 5'-CCAGCGAAGGCGTAGTAGAT-3' (3D41-Hpa2-190R) and 
5'-GGCAGTCTAAAAACTCCAGGC-3' (3D41-Hpa2-82430F). The Advantage-GC-genomic 
DNA polymerase kit was used. Thermal cycling was performed at 95°C for 7 min, 95°C for 45 
sec, 64°C for 45 sec, 72°C for 90 sec, then followed by 29 cycles, and finally 72°C for 5 min. In 
both assays, aberrant methylation of colon cancer cells is indicated by recovery of a PCR 
product from DNA that has been digested with the restriction enzyme Hpa2. 
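
The readout of these digestion assays can be sketched as follows. The Python fragment below is illustrative only; the function names and the toy sequence are hypothetical. It relies on the recognition sequence of HpaII being CCGG and on cleavage being blocked when the site is methylated, so that an amplicon spanning several sites is recovered only if none of the sites was cut.

```python
HPAII_SITE = "CCGG"  # recognition sequence; cleavage is blocked by CpG methylation


def find_hpaii_sites(seq: str) -> list:
    """Return the 0-based start positions of every CCGG site in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == HPAII_SITE]


def amplicon_survives(site_methylated: list) -> bool:
    """Return True if a PCR product spanning the sites is expected after digestion.

    site_methylated holds one boolean per CCGG site between the primers.
    A single unmethylated site is cut and prevents amplification.
    """
    return all(site_methylated)


# Hypothetical toy sequence; this is not the SLC5A8 CpG island.
sites = find_hpaii_sites("AACCGGTTCCGGA")
fully_methylated = amplicon_survives([True, True])
partly_methylated = amplicon_survives([True, False])
```

In this simplified model, the 4-site and 6-site assays differ only in how many sites must all be methylated for a product to be recovered.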

Incorporation by Reference 

All publications and patents mentioned herein are hereby incorporated by reference in 
their entirety as if each individual publication or patent was specifically and individually 
indicated to be incorporated by reference. In case of conflict, the present application, including 
any definitions herein, will control. 

Equivalents 

While specific embodiments of the subject invention have been discussed, the above 
specification is illustrative and not restrictive. Many variations of the invention will become 
apparent to those skilled in the art upon review of this specification and the claims below. The 
full scope of the invention should be determined by reference to the claims, along with their full 
scope of equivalents, and the specification, along with such variations. 

We claim: 

1. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of: 

a) an amino acid sequence at least 95% identical to SEQ ID NO: 1; and 

b) an amino acid sequence encoded by a nucleic acid that hybridizes under high 
stringency conditions to a nucleic acid of any one of SEQ ID NOs: 3 or 4, 
wherein said polypeptide is a cell surface protein. 

2. The isolated polypeptide of claim 1, wherein the polypeptide comprises a transmembrane 
domain as set forth in any one of SEQ ID NOs: 19-31. 

3. An isolated antibody, or fragment thereof, which is specifically immunoreactive with an 
epitope of an amino acid sequence as set forth in SEQ ID NO: 1. 

4. The antibody of claim 3, wherein said antibody is selected from the group consisting of: 
a polyclonal antibody, a monoclonal antibody, an Fab fragment and a single chain 
antibody. 

5. The antibody of claim 3, wherein said antibody is labeled with a detectable label. 

6. An isolated nucleic acid selected from the group consisting of: 

a) a nucleic acid comprising the nucleotide sequence of SEQ ID NO: 2, or a complement 
thereof; 

b) a nucleic acid molecule that encodes a polypeptide comprising the amino acid 
sequence at least 95% identical to the amino acid sequence of SEQ ID NO: 7; and 

c) a nucleic acid molecule that hybridizes under stringent conditions to SEQ ID NO: 2. 

7. The nucleic acid of claim 6, further comprising a vector nucleic acid sequence. 

8. A host cell which contains the nucleic acid of claim 6. 

9. A method for producing the polypeptide of claim 1, comprising culturing the host cell of 
claim 5 under conditions in which the nucleic acid molecule is expressed. 

10. A method for detecting the presence of the polypeptide of claim 1 in a sample, 
comprising: a) contacting the sample with an antibody which selectively binds to the 
polypeptide of claim 1; and b) determining whether the antibody binds to the polypeptide 
in the sample. 

11. A kit for detecting a human SLC5A8 polypeptide comprising: (i) an antibody of claim 3; 
and (ii) a detectable label for detecting said antibody. 

12. A method for detecting the presence of the nucleic acid of claim 6 in a sample, 
comprising: 

a) contacting the sample with the probe or primer of claim 6; and 

b) determining whether the probe or primer binds to a nucleic acid in the sample. 

13. A kit comprising the probe or primer of claim 6 and instructions for use. 

14. A method for identifying a compound which binds to the polypeptide of claim 1, 
comprising: 

a) contacting the polypeptide, or a cell expressing the polypeptide of claim 1, with a test 
compound; and 

b) determining whether the polypeptide binds to the test compound. 

15. A method for modulating the activity of the polypeptide of claim 1, comprising 
contacting the polypeptide or a cell expressing the polypeptide of claim 1 with a 
compound which binds to the polypeptide in a sufficient concentration to modulate the 
activity of the polypeptide. 

16. A method of inhibiting aberrant activity of a SLC5A8-expressing cell, comprising 
contacting the cell with a compound that modulates the activity or expression of the 
polypeptide of claim 1, in an amount which is effective to reduce or inhibit the aberrant 
activity of the cell. 

17. The method of any of claims 14-16, wherein the compound is selected from the group 
consisting of a peptide, a phosphopeptide, a small organic molecule, an antibody, and a 
peptidomimetic. 

18. The method of any of claims 14-17, wherein the cell is found in the colon, kidney, lung, 
esophagus, small bowel, stomach, thyroid, uterus, and breast. 

19. A method of treating or preventing a disorder characterized by aberrant activity of a 
SLC5A8-expressing cell, in a subject, comprising administering to the subject an 
effective amount of a compound that modulates the activity or expression of the 
polypeptide of claim 1, such that the aberrant activity of the SLC5A8-expressing cell is 
reduced or inhibited. 

20. A transgenic mouse having germline and somatic cells comprising a chromosomally 
incorporated transgene that disrupts the genomic SLC5A8 gene and inhibits expression 
of said gene, wherein said disruption comprises insertion of a selectable marker sequence 
resulting in said transgenic mouse exhibiting increased susceptibility to the formation of 
tumors as compared to the wildtype mouse. 

21. The transgenic mouse of claim 20, wherein said mouse is homozygous for said 
disruption. 

22. The transgenic mouse of claim 20, wherein said mouse is heterozygous for said 
disruption. 

23. A transgenic mouse having germline and somatic cells in which at least one allele of a 
genomic SLC5A8 gene is disrupted by a chromosomally incorporated transgene, which 
transgene inhibits the expression of said genomic SLC5A8 gene, wherein (i) said 
genomic SLC5A8 gene encodes a SLC5A8 protein; and (ii) said disruption comprises 
insertion of a selectable marker sequence, which replaces all or a portion of the genomic 
SLC5A8 gene or is inserted into the coding sequence of said genomic SLC5A8 gene; 
and (iii) said transgenic mouse has increased susceptibility to the development of 
neoplasms. 

24. Isolated mammalian cells comprising a diploid genome including a chromosomally 
incorporated transgene, which transgene disrupts the genomic SLC5A8 gene and inhibits 
expression of said gene. 

25. The cells of claim 24, which cells are mouse cells. 

26. A method for generating a mouse and mouse embryonic stem cells having a functionally 
disrupted endogenous SLC5A8 gene, comprising the steps of: 

(i) constructing a transgene construct including (a) a recombination region having all or a 
portion of the endogenous SLC5A8 gene, which recombination region directs 
recombination of the transgene with the endogenous SLC5A8 gene; and (b) a marker 
sequence which provides a detectable signal for identifying the presence of the transgene 
in a cell; 

(ii) transferring the transgene into embryonic stem cells of a mouse; 

(iii) selecting embryonic stem cells having a correctly targeted homologous 
recombination between the transgene and the SLC5A8 gene; 

(iv) transferring said cells identified in step (iii) into a mouse blastocyst and implanting 
the resulting chimeric blastocyst into a female mouse; and 

(v) selecting offspring harboring an endogenous SLC5A8 gene allele comprising the 
correctly targeted recombination. 

27. A method of evaluating the carcinogenic potential of an agent comprising: (i) contacting 
the transgenic mouse of claim 20 with a test agent; and (ii) comparing the number of 
transformed cells in a sample from the treated mouse with the number of transformed 
cells in a sample from an untreated transgenic mouse or transgenic mouse treated with a 
control agent, wherein the difference in the number of transformed cells in the treated 
mouse, relative to the number of transformed cells in the absence of treatment or 
treatment with a control agent, indicates the carcinogenic potential of the test compound. 

28. A method of evaluating an anti-proliferative activity of a test compound, comprising: 

(i) providing a transgenic mouse of claim 20 having germline and somatic cells in which 
the expression of the SLC5A8 gene is inhibited by said chromosomally incorporated 
transgene, or a sample of cells derived therefrom; 

(ii) contacting the transgenic mouse or the sample of cells with a test agent; and 

(iii) determining the number of transformed cells in a specimen from the transgenic 
mouse or in the sample of cells, 

wherein a statistically significant decrease in the number of transformed cells, relative to 
the number of transformed cells in the absence of the test agent, indicates the test 
compound is a potential anti-proliferative agent. 

29. A method for detecting differential methylation patterns in a SLC5A8 nucleotide 
sequence, comprising: 

a) obtaining a sample from a patient; 

b) assaying said sample for the presence of methylation within a nucleotide 
sequence as set forth in any one of SEQ ID NOs: 12-13 or fragments thereof; 

c) obtaining a sample from a healthy subject; 

d) assaying for the presence of methylation in a nucleotide sequence as set forth in 
any one of SEQ ID NOs: 12-13 or fragments thereof; and 

e) comparing the methylation patterns in the sample from the patient to the 
methylation patterns in the normal sample. 

30. A method for detecting a SLC5A8-associated cancer, comprising: 

a) obtaining a sample from a patient; and 

b) assaying said sample for the presence of methylation within a nucleotide 
sequence as set forth in any one of SEQ ID NOs: 12-13 or fragments thereof; 

wherein methylation of said nucleotide sequence is indicative of a SLC5A8-associated 
cancer. 

31. The method of any one of claims 29 and 30, wherein the sample is a bodily fluid selected 
from the group consisting of blood, serum, plasma, a blood-derived fraction, stool, urine, 
and a colonic effluent. 

32. The method of claim 31, wherein the bodily fluid is obtained from a subject suspected of 
having or is known to have a SLC5A8-associated cancer. 

33. The method of claim 32, wherein said SLC5A8-associated cancer is selected from the 
group consisting of: colon cancer, breast cancer, thyroid cancer, and stomach cancer. 

34. The method of any one of claims 29 and 30, comprising assaying for the presence of 
methylation within the SLC5A8 sequence as set forth in SEQ ID NO: 14. 

35. The method of any of claims 29-34, wherein the assay is methylation-specific PCR. 

36. The method of claim 35, comprising: 

a) treating DNA from the sample with a compound that converts non-methylated 
cytosine bases in the DNA to a different base; 

b) amplifying a region of the compound converted SLC5A8 nucleotide sequence 
with a forward primer and a reverse primer; and 

c) analyzing the methylation patterns of said SLC5A8 nucleotide sequences. 

37. The method of claim 35, comprising: 

a) treating DNA from the sample with a compound that converts non-methylated 
cytosine bases in the DNA to a different base; 

b) amplifying a region of the compound converted SLC5A8 nucleotide sequence 
with a forward primer and a reverse primer; and 

c) detecting the presence and/or amount of the amplified product. 

38. The method of claim 35, wherein the forward primers are selected from SEQ ID NOs: 8 
and 10. 

39. The method of claim 35, wherein the reverse primers are selected from SEQ ID NOs: 9 
and 11. 

40. The method of claim 35, wherein the compound used to treat DNA is a bisulfite 
compound. 

41. The method of any of claims 29-34, wherein the assay comprises using a methylation- 
specific restriction enzyme. 

42. The method of claim 41, wherein said methylation-specific restriction enzyme is selected 
from HpaII, SmaI, SacII, EagI, MspI, BstUI, and BssHII. 

43. The method of claim 41, further comprising a pair of primers selected from SEQ ID 
NOs: 5-7. 

44. A method for detecting a SLC5A8-associated cancer in a subject, comprising detecting 
SLC5A8 protein or nucleic acid expression in a sample from the subject. 

45. The method of claim 44, wherein the sample is a bodily fluid selected from the group 
consisting of blood, serum, plasma, a blood-derived fraction, stool, urine, and a colonic 
effluent. 

46. The method of claim 45, wherein the bodily fluid is from a subject suspected of having 
or known to have a SLC5A8-associated cancer. 

47. The method of claim 46, wherein the SLC5A8-associated cancer is selected from the 
group consisting of: colon cancer, breast cancer, thyroid cancer, and stomach cancer. 

48. The method of claim 44, wherein the SLC5A8 protein is detected by immunoassays. 

49. A method for identifying an agent which enhances SLC5A8 protein or nucleic acid 
expression in a diseased cell associated with SLC5A8 gene silencing, comprising: 

a) contacting the cell with a sufficient amount of the agent under suitable 
conditions; 

b) quantitatively determining the amount of SLC5A8 protein or nucleic acid; and 

c) comparing the amount of SLC5A8 protein or nucleic acid with the amount of 
SLC5A8 protein or nucleic acid in the absence of the agent, 
wherein a greater amount of SLC5A8 protein or nucleic acid in the presence of the agent 
than in the absence of the agent indicates that the agent enhances SLC5A8 protein or 
nucleic acid expression. 

50. The method of claim 49, wherein said SLC5A8 gene silencing is due to differential 
methylation of a SLC5A8 nucleotide sequence. 

51. The method of claim 50, wherein differential methylation occurs within a SLC5A8 
nucleotide sequence set forth in any one of SEQ ID NOs: 12-13 or fragments thereof. 

52. The method of claim 49, wherein the diseased cell is from a subject having colon 
neoplasia. 

53. A method for monitoring over time a SLC5A8-associated cancer comprising: 

a) detecting the methylation status of a SLC5A8 nucleotide sequence in a sample 
from the subject for a first time; and 

b) detecting the methylation status of the SLC5A8 nucleotide sequence in a sample 
from the same subject at a later time; 

wherein absence of methylation in the SLC5A8 nucleotide sequence taken at a later time 
and the presence of methylation in the SLC5A8 nucleotide sequence taken at the first 
time is indicative of cancer regression; 

wherein presence of methylation in the SLC5A8 nucleotide sequence taken at a later 
time and the absence of methylation in the SLC5A8 nucleotide sequence taken at the first 
time is indicative of cancer progression. 

54. The method of claim 53, wherein the sample is a bodily fluid selected from the group 
consisting of blood, serum, plasma, a blood-derived fraction, stool, urine, and a colonic 
effluent. 

55. The method of claim 53, wherein the SLC5A8-associated cancer is selected from the 
group consisting of: colon cancer, breast cancer, thyroid cancer, and stomach cancer. 

56. A method for treating a SLC5A8-associated proliferative disease in a subject, comprising 
administering to the subject a sufficient amount of a compound, wherein the compound 
modulates the SLC5A8 protein or nucleic acid expression. 

57. The method of claim 56, wherein the disease is associated with methylation of a 
SLC5A8 nucleic acid sequence, and the compound induces SLC5A8 expression. 

58. The method of claim 57, wherein the compound is a demethylation agent selected from 5- 
azacytidine and 5-deoxy-azacytidine. 

59. The method of claim 56, wherein the SLC5A8-associated proliferative disease is selected 
from the group consisting of: thyroid nodular hyperplasia, thyroid adenoma, thyroid 
cancer, colon neoplasia, breast cancer, and stomach cancer. 

60. A method for treating a SLC5A8-associated cancer in a subject, comprising 
administering to the subject a vector containing a SLC5A8 nucleic acid which is 
operably linked to a heterologous promoter. 

61. The method of claim 60, wherein the SLC5A8 nucleic acid encodes a polypeptide at 
least 90% identical to SEQ ID NO: 1. 

62. The method of claim 60, wherein the cancer is a colon neoplasia. 

63. A bisulfite-converted methylated SLC5A8 nucleotide sequence selected from the group 
consisting of: 

a) a nucleotide sequence of any one of SEQ ID NOs: 15-18 or a fragment 
thereof; 

b) a complement of any one of SEQ ID NOs: 15-18; and 

c) a nucleotide sequence that hybridizes under stringent conditions to a nucleotide 
sequence of any one of SEQ ID NOs: 15-18. 

64. Oligonucleotide primers for detecting methylation of a SLC5A8 nucleotide sequence, 
selected from SEQ ID NOs: 5-11. 

65. A kit for detecting a SLC5A8-associated cancer in a subject, comprising at least two 
primers of claim 64. 

66. The kit of claim 65, further comprising a compound to convert a template DNA. 

67. The kit of claim 66, wherein the compound is bisulfite. 

68. The kit of claim 67, wherein each primer comprises at least a CpG dinucleotide. 

69. A method of converting a nucleic acid sequence at least 95% identical to any one of SEQ 
ID NOs: 12-13 or fragments thereof, to a bisulfite converted sequence comprising: 

a) providing a nucleic acid having a nucleotide sequence as set forth in any one 
of SEQ ID NOs: 12-13 or fragments thereof; and 

b) adding a bisulfite compound, 

whereby the unmethylated cytosine bases of the CpG islands are converted to a different 
base. 

70. The method of claim 69, wherein the unmethylated cytosine is converted to a uracil. 

72. An isolated or recombinant methylated SLC5A8 nucleic acid, comprising a nucleotide 
sequence as set forth in any one of SEQ ID NOs: 12-13 or fragments thereof, wherein the 
cytosine of the CpG island is methylated. 

73. An isolated or recombinant SLC5A8 nucleic acid, selected from the group consisting of: 

a) a nucleotide sequence as set forth in any one of SEQ ID NOs: 12-13 or a 
fragment thereof; 

b) a complement of any one of SEQ ID NOs: 12-13; 

c) a nucleotide sequence that hybridizes under stringent conditions to a nucleotide 
sequence of any one of SEQ ID NOs: 12-13; 

d) a nucleotide sequence that is at least 98% identical to the nucleotide sequence of 
any one of SEQ ID NOs: 12-13; and 

e) a nucleotide sequence comprising at least 50 consecutive base pairs of any one of 
SEQ ID NOs: 12-13, 

wherein the SLC5A8 nucleotide sequence is differentially methylated in a SLC5A8- 
associated disease cell. 

74. A method for detecting colon cancer, comprising: 

a) obtaining a sample from a patient; and 

b) assaying said sample for the presence of methylation of nucleotide sequences 
within at least two genes selected from the group consisting of: SLC5A8, 
HLTF, p16, and hMLH1; 

wherein methylation of nucleotide sequences within the two genes is indicative of colon 
cancer. 

75. The method of claim 74, wherein the sample is a bodily fluid selected from the group 
consisting of blood, serum, plasma, a blood-derived fraction, stool, urine, and a colonic 
effluent. 

76. The method of claim 74, wherein the bodily fluid is obtained from a subject suspected of 
having or is known to have colon cancer. 

77. A kit for detecting colon cancer in a subject, comprising primers for detecting 
methylation of nucleotide sequence within at least two genes selected from the group 
consisting of: SLC5A8, HLTF, p16, and hMLH1, 

wherein the primers for detecting methylation of SLC5A8 nucleotide sequence are 
selected from SEQ ID NOs: 5-11; 

wherein the primers for detecting methylation of HLTF nucleotide sequence are selected 
from 5'-TGGGGTTTCGTGGTTTTTTCGCGC-3', 5'- 
CCGCGAATCCAATCAAACGTCGACG-3', 5'- 
ATCACCACAAATCCAATCAAACATCAACA-3', 5'- 
GCACGACTAAAAAATAAATCGCCGCG-3', 5'- 
AAACACACAACTAAAAAATAAATCACCACA-3', 5'- 
TAAAACCTCGTAACTTTCCCGCGCG-3', 5'- 
GTCGCGAGTTTAGTTAGACGTCGAC-3', 5'- 
TCCTAAAACCTCATAACTTTCCCACACA-3', and 5'- 
AGTTGTTGTGAGTTTAGTTAGATGTTGAT-3'; and 

wherein the primers for detecting methylation of hMLH1 nucleotide sequence are 
selected from 5'-AACGAATTAATAGGAAGAGCGGATAGCG-3', 5'- 
CGTCCCTCCCTAAAACGACTACTACCC-3', 5'- 
CGTTTTTTTTTGAAGCGGTTATTGTTTGT-3', and 5'- 
AACGAACCAATAAAAAAAACAAACAACG-3'. 

78. The kit of claim 77, further comprising a compound to convert a template DNA. 

79. The kit of claim 78, wherein the compound is bisulfite. 

NCBI GenBank Accession Number AC06395r 

Human genomic sequence that covers 15 exons of the Huil gene and the promoter 
region. Base pairs 322Q0-8326.7 are highlighted in yellow on page 

GAATTCnrrCAGGGTCCTCCACTAATGCAGGATAAAACCCAACCrAAC^^ 

AAAG€CAAGCCA;GAGCTTTGCTGATAtCATGTCATATCATGT.GTCCATGCGCT 

CAGACTGCAAACATTTATGAACACACGTCTTTTCTGTTCCCTCTCTTAGTAm 

TTAACCCCGCTTCTGACTGGCAGACtGCTTITCAACCTTCAAGGACTATC " 

• AATACAAGTCCTTrGAAAAGCTTTCTCTGTTCCrCAACCCCAGGCAGAAT^ 

■•TA<5TTATAATTCAATATATCTCCAACAGACAAATTCGCTAGACfATAACTA 

CGGTTCCAGGCTCCATTTCTCCAACAACTATCAAAACCTGCAGGCATCACTGT 

TTACCCTGGAATAGCCAGCTCATACCCAGCACAGAGTAAGCTCTACTGCTTAC 

TGCAATATGTGCTTACATATATTGCAGAATATGTATAGATTTAGCTTTTGCCA 

cactttattttgcttattt toatataaattaagcxtttat^^^ . 

gtitatttacarractgcatgcaattcccatttccttaacaagattataaattc 

otgagggcagagccitrgtcaca(ntcacttgtattccrcagaaggttact^ 

aatgctaaacatattattaacrgctcaattaatatttgttgaagtaaattact 

tmgtatggttrctttgatgagggtctccggcaaggaaacattttcacccag- 

gatcaaagcagaaattacatccaagagggtctcccagcaagatgtggagag 

actggctacitgcagtgtttgagataacacctgcaaaaagaaaaaacaatgc 

atactcagcaatcagcttttaaaaaccagactctcgagcatattaagttggg " 

aactcaccttacagacatcagcaggcgtgggtatctttgtcccacttccatgt 

titacaagtataaggtatgtttccaacaacctttraatctgttcagaacm 

acaacagttagtttitgttacntitgtgfgtagatccaagagcgattcetgga 

aacaaagaatttaacagagagaatataaagtatcaaactcaactgttaatcc 

aacaataaaaactgtgggacagttcaacttitcccatrctcrtgctitggaag 

aaaAagatacatttaccaggactcactaacgataacagctaacattggctga- 

GTGCATTGTTCrAAGTCTilTlTililllTlTlTllTiTGAGATGGAGTTTCACT 
CTrATTGCCCAGGCTGGAGTGCAATGGCGCAATCTTTGCTCACCGCAACCTCT 
GCCTCCCAGGTTCAAGCGATTCTCCTGCCTCACCCTCCCAAGTAGCTGGGATT 
GCAGGCATGCGCCATrTrGTATTTTTAGTAGAGACA<K:GTTTCTCCATGTTGT 
TCAGGCTGGTeTCGAACTCCCAACCTCAGGTGATCCACCCACCTCGGCCTCCC. 
AAAGTGCTGGGATTACAGGTGTGAGCCACTGTGCCCGGCCCATTCTAAGTTTT 
. TrACAAGTATTCACTCATCATCCTGACAGCAACCCTGAGGGAGAGAATATTA 
CrACCCCCATtrATTATCTGAAGAGACTGGGGAATCGAGATITCAAATAATtT 
TCCCAGGTTACACtAGCAGTAAGTGGAAGGGTCAAGATTCAAACAGGCAGTC 
TGGCTCCAGAGCCTTCTACTGCATCTCAAGATAACATATCAAATAAAAAATA 
GCAeAGGGGGGCAGAGGGAAGGAAAATTTTAATATGTGTACAGAAGTATAA 
AtAAAACAATTATAAAATATAACTTCTAGGAGTGATTTGCAAAACTGAGAGC 
AGAAACAGTAAATrCTTAGACTCCTTATCAAAAACCCATCCJTAAAGATTAG 
AACTCACTTGCAAACATTCAAAAAATGTACCAAAATGTTCCTTGGAGATGTA 
GGATACAGTGGATTTGACCATGTTTTTGAGTGTTTCTCCAATTAACATCCATG 
GTAGTTGAGTTTCfGTTTCAGTGACTGGTCCTAGCTTTCGCAAAATTAGCTTC . 
ACTGCCTGTGGGAATACAAGTCATTAACATTAATATATAAACTCITATTTTCA. 
AGTATTGCTCTGGGAGTCACCACTCTGATTACATTAGCTACCTTAAAAATCAG 
GTCTGiGAATAGTCAAGGATATGGCAACTTTGATGGTTAATAACGTGTCCCAA 
' AAGGAACrTGTAAACTAGGATGGGAATTTTATCATTACAACTGT^ ' 
ACCTGCCCTTGCCCATTAATCAATTTATCTTACCCCCTAGCAf CCAAAATAAA ■ 

• GAAGACATCAATGGCrCTGTAAGTAACATCTCACTGTCCAAGGCCTACATTG 

• GGTAAGCATTGTTACTGAACACCCAACACTCCTCATCTAGTGCTATGTACTCT 

. AGGGTGTCTTCCCTCCACATCATTTGCCTCATTCTTTTCTAC^ : 

• ••" CATCCACATAGACTCTAAGCACCCCTCCCCATTATGTACCTGGCCTGTAGAGG ' 
• AGTGAAACATATTTCTAACrCCTTTGCACATTTCAAAGAGCAACTGTCCAA 
•CCTTCAACTrrTTCTGGATGTTTATCAAGATCAAGAAACAT^ 

■ TGCGTTTTTATCAGAGACCTAAAATATi^GTAGAAATGGAA^ .. 
ACAAACATAAACATATCCAAATGAAGAACTCTCATTTGCTATTCTATATATTG 
TTATTTAAATATACTTAGGAAAAAATTGTGAAAAGAACAGACATTTGTAATG 

. ATTTCAAACTACrCTAATAGTGTGTTGTCACrrGGAATGTCACACAAAAGAGT 
GAATTCATTGGTATTACAGGAAGATATGTACTACAAAAAAAACTACAGAAGT 
CAACATTCTATCTTTtAAAGTAAATACATTTGTATTTTGTATTGTATACAGTAA 
AATATirGTATTTTATGAAATAAAAAATTTAATATTAATGTTm 
CAAAAACCCTCTAAAATTTACTATGAATATnTATACAAACACAAATGACAC 

• CATGCTACCAGTTACITCCAGGAAACTATTAGAAAAGTTCTTGTTAATGTCAC 
AGTTCATGTATITACTGACTTGAATTAGAGAGCATTTTTGATATATAGACACT 
GAAAGCTATCACAAACCTAAATAACAATACAATAGTGACCCAAATTAGATCA 
CATCAATTTGTGAGCAACAAATAAGTCTTACTGTGAGATAAATTCAAAATAG . 
TAACAGATTTCTGTTAAGTATATATATAGATACATAGAAAAAATTAGCCCAT 
AATAACCACATrTTTAAAAGATATCTACTTCTATTCATCAGATTATTATTATTA 
TrATTATTACTAGTGTGTGTGTGTATGTGTGTGTGTGTGTGATATGGAGTCTTG 
CTCTGTCGCCTAGGCTGGAGTGCAGTGGTGTGATCTCAGCTCACTGCAACTTC. 
CACCrCCTGGGCTCAAGCAATTCTCATGTCTCAGCCTCCCAAGTAGCTGAGGA 
TACAGGTGCGCGCCACCACGCCCAGCTAATTTTTGCATmTAGTAGAGACGG • 
GGTTTTGCCATGTTGGTCAGGCTGGTCTCCAATTGCTGACCTCAGGTGATCCA . 

■ CCTGCCrCGGCCTCCCAAGGTGTTGGGATTACAGGCATGAGCCACCACGCCT 
GGCCTATTCATCAGATTCTTGATGATTAGCAACAAACAGATAAAATACCAGA 

■ CTAACCrirCTCATCAAAAAAGTAAAACTTTCAGCAGCAAAATTTCTTATATG 
TAGTTTTTTAtGAGCCAGGAGTGTGCTGTACATGCTATACATGAAAAAAATA 
AGATACATTTCATTAATCATATAATTGTAATAAATACATACTACATGTCAACA 

• ■ ATATGGGCAACAATGTGCTGGGTATGCAAGGAATACACAGCAGGTATCAAAC 

A^TTTAAAATCTCATTCATTTATGGAGACACCCACATGTTGAAAGGAAGAC 

TTGACCACAGACATGAAGAGTCCTAGGACTGGTGGTACTGGTTTTACAAACA 

AGACTCCAGGAAAAGTTGAAATTTGTAATGAGCTCTGAATGAAAGAAGAATT 

AGGTGGGGACTGCAGTCCATATTATTGGTATAAAAGCAAGAGCAAAGATGAG 

GCAGGTGGAAATGATCATGGTCATGACAAGGAGGCTGGTCCATCTAAAAGAG 

GAAAGATGATACAGTAGAGGAGAGCAGCTATGGATAAAGTTGGTCAGGTAG 

• ACAGGTCTAGCTTACATCTGTATAAGCACTTACTCTGTGTTACGCCATTTAAT 
CAGeAGAATAACTCTATGGGATGGGTACTATTATAATCCTCCCATTCTACAGA 
TAATGAAAGTGAGGCAGAAAGCATAAGCAACTTGCTCAAGGTCAAGCAGCC 
ATGCATCTATAACTAAATAATTACTtATATATAATCACATTGTTAAATTTGGT 
CTCCCTAATGATAGAAGGGTATGGAATATATCTCTCCAATTTTCTCATAACCC 
CAGTACCTAATAGTTCCTTGCTGACAGCAGGTACTAATAAATGTrGGCTGAAT 
GAGAAATGAeCATTTTCAGAAAGACTAATTTGGCAGCAATATACAGGATAAA 




ataaaggaggaaagaagagtctgctaattcagtcagaaacigfgctcaagtc ' 

atacaagcttgggctaacaggcatgaaagagactggaaggagaggcaaAat ■ . 

ggcaAgggatgaacccagtagacatttcaggagtgcccacaatgaagctga ■ 

agacctracagcggtccacagggccgtggatgatcrggcgccn"tactactgct 

tctctgacatcacttagctagtctcacccttatacaggctgctcggccaccta 

aaacttgtcccgggcatgtgcccaagacattctccccctgctaaaatgtaaac 

mgtgagaagaggattttttatctgttttatctgtgactgtatctcaagtgcc 

tgacatacagacagtgtccaataaatattgactgagcaaatgcatgaatgac 

.agaatcaagaggatgtggacacctetgaagacaactgggttagtgatgcctc- 

tgatattccaaaactcagagacaggaagaacgttagtattaatgacagaaaa 

igggcaaccaaacagggcc^ggcatacggcagacactc^^tacctatrtact " 

gaacacitgaatgaatgtacagataagaggagttgatttaacaggaaagtgt 

tgagttcagtlnrcagtaatacaagtgaaaatagccaggctactgaagttgtc 

ACACCGAAAGCAAATAAATTTAGAACTGAAGAGCCTGACTTAGGAGACATA "• 
AAAGTGGCAGCriTTATGACAGCTGAAGTCATAAAACTAAATCAAtTAT^ 
CCAAAGACCrTGTTGGTTiGTTTAATGAAATAATATTGGTTTCTAACATAT^ 
ACCAAAAGCTrCATAGGGAACTAGTAAAGGCTGGAAAAAGATTTCTTTCTCT 
TTCTTCATTCAAAGTTCCrAAAAGAGAACGGTGGGCTGAGTGAAACXSGGTGG 
CATAAAACAAAGTCTATTGTCCCATACCATGCATACACCATTTACTGTGGGGT 
. AAGGGGCTGATGACTGATATTAAACACTCCTGTGQCAATTATAGAACATGAA 
AAGAACCGGAAGACCAAGTGGAAAATCCCACCAAACCCCAGACTTAAACAG 
ATAATAAGACAATTrATGCrrACTTTAAACAAAAGGTTACTACAAAATTCATT 
CTCnTrCCTTGTCCACATTTACTCACAMTACTGAGTGTCTCrTTCAG(J^ 
AAGTATTAGTGGCTGGGGAAATGTCAGTGTTTAAAACAGGGCTCTTAACCTG 
TCTTAAACCTCCCTTATACCAGGTGTAGAGAdGACAGCAATACATTATAACC 
AAACAATTCTGGTAAAAGTTGTGTGACATAGGTGGTTAAAAAATGCCAAGAG 
GATAGAGAGTACAAAATTTTATTAATACTTCTGGCAAAGGTGCCTGTTTGTGG 
AGGCTGTAGGGACCGAGTCAGGCCTTCTTGGGTGGGTATACTTCCAGTAGCT 
GGAGAGGAGAGATGGTCATGGCAACTCAGACTGGCAGACTAGATTGTGGAe 
CACTTTGGCTGGAGAATCAACTTTGTACATGGAAACAATTAGGCACTGAGCT 

ggcaaagtggtttggagtcataaagaagccagggaaaaagaagcaggc aat . 

taaaaaacatactagagagagaaaggggttacctgtatatactggacatgtc 

cttcaccatcagtctccacaggtacttataaagatatgataacgaggtgaaa 

gcccattctaacaactctgtgtcctgagtctccaggatcgaggtgatagtcaa. 

aaaaaactctggaaagtgtgggtagaaatccatctgcagatctcgtgccaac 

tgtacaaccaaactggattaaaaaaagaagcaactgatcatgtggatttttt 

taaaaggctacaaatctacttaacgatagtaggtggcctatgatcataggat 

taattaaataaaaatacnrctaaatcaagtctgaaaaattataaaatrcttta 

aatcttaagatctcagcaaagaaatcaggcaagaagttacattccatcataa 

gttgtagccaactgttcctatagtgtcaaagaaaaatgagattatcttagtat 

ataacaaatgcgaaattataccccaagattccactttatggtaatgcacrrat 

tcctaacagaaaaggaaaatcctcchtgtttctaaattatgagcacctgatct 

gcggtatatgtagctttgaaacacaat(ntrcttaaaattcaggcaaatatag 

TTGTGAGCTTrCTCTCAAGATGCTACTTACTCCAAAAGGGGTTGATAGGCAAA 

ACrGTTCTTAAOTGCAGGTGAGTCTTCAAACTCTGAACTATCrCGTm 

GATACACCAACrGATTGAATGATTGGCATTTGTCAATAACTTCnTrGTAAA^ 
TrrCCTGAGTGGGKKjAAAAAAAAATATGTTCTTTTCTATCACACCACT^

GC'TGTATGGAACATATAAAGTGCAAACTATGGGTGAATAAAATAGAAACGA ' • 

AATAGAAAATACCGAAGTGTTCTGTGAGGTTTAATTCTCTCCATTTCAGCAGA 

CCCTCAAAAAAGTAGGTTTCAACCTCCTGTCAAGATtAAAGCAGTCCATTAAT 

CAAATAACTCTAAGAAAGCAATGATCAAGGTAAATAGAATAATGATATATTA 

ATATOTAAGACAATGATACTGAAGAGTTTAAGAACTCCTTTTAGTTm 

AACAGGCCATTAAGTAACACACTAAAAAGCAATCAGAAGTTATCAATAGGCC ' 

ACTTATAAAATGCTGmTGTTGATTTGGTTTCAGCAAATAATmCT^ 

CTATATATATTGTCCACTAAGGGAA^ATTAtTTTCTGTGTTITrTTATAA;!^ 

AGGCAAAGAATAAGGTGTTGGTTCTTTACAGTTCOTCATTCTGCTmAG^ 

ATATGAATTACTACAACCTTATAAAGAAGTAATATGGCATTCCTGTTAAAATT 

CAAAATAGTTACGCTCTTTGACCCAGGTAiaTTCTTATAAAGTTTGCAAGCCTT 

TAAGTAAAAGATGTTTATTCAGCTTAACTACAGTQTGGGGAAACATTAAACA' 

ggctaaaatatccgacaataggaaaatggttggaatagcccatggtctatat 
atactaaggtatattatgtggctactaaaaagagctatatctctaitgaacta 
gaactagactgaagatatgcccccaaAatgataaAagttatgcagtgagagg . 
gtatggcaggattcaarittccttaaaaaacaaaaataaaa^cccrctata 

AATGTTTGTACATACGAACATGAGAGGACAGTATGGAAGAATACAACTATAC . 
TGAACTGTCAGTGTTGGCTTCCTTGGGAATTAGGGGTGGGAGGAGTGAGATA 
ATGGGCTTTTCGTAAGTTtACCACrATGTTACTTAACTTGTTAACATGGTATC ' 
AATTACTTTGTACTTTGAAAGGTAAAGCCAATAAA'rTATCATACATGTATGAC 
ATGTATGTATACATGTACGTATGTGTGTATATAGATGTATAAATAAATGCATA 
CCCCCCCAAAACAATCATTCTACCAAAAAGATACATGCCTTCGTATGTTTATC' 
TCtACACTATCCACAATAGCAAAGACATTGAATTGTCCCAGGTGTCCATCAAC 
AGTAGATT.GGATAAAGAAAATATAGTACATATACACCATGGAATATTATATG 
CCTtAAAAAAGAATGAAATCGTGTCCTTTGCAGCAACATGGATACAGCTGGA 
GTCCATAATACTAAGCAAATTAACATAGGAACAGAAAACCAAAAACTGCATG 
TTCTGACTTATAAGTGGGAGCTAAATATTGAGTACACATAGACATAAATATA 
• AGAACAATGAACACTGTGGATTACTAGAGGGTGGAGGAAGTGGGGATGGGT 
TAAAAAAAACTACCTGTGGGGTACTATGCTCACTACCTGGGTGACAGGATCC. 
ATACTCCAAACCTCAGCATCACACAATATTCCCATGTAACAAACCTGCACAG- 
GTACCCCCrGTATCTAAAATCAAAGTTGAAATAAAAATAAAAATAAATACAA 
ATATGTGTTTATAGAGAGAGAGAAAAAAGAGAGAATAAACACATAAGCACA 
CATGCAAACAGCATGCCAAATCTACAATATCAAAAAAAAAAATCCTTAAACT 
GTTCmGGAAATCTTTAAAATCAATAGTTAGGCAGAATAGATACTATGTAAC 
CACAAATATTAAAAACTAAAAATTAAAAAAAAAGGCAGAAAAGAA^^GAGAA 
TCCCATTAAATTTTGTTTrAGGCTGGGGACAATGGCTCATGCCTGTAATCCCA 
ACAGTGGGAGGCrGAGGCAGGAAGAGCATTTGAGCCCAGGAGTTTGAGACT. 
AGGCrAGGCAACAACACGAGATCCTATCTCTATTTTTCAAAAGTAAAAATATT 
AAAATTTTTTTGTTGTTGTTTTCATGTCCriTAAGGCATTTTCATGTCC 
GGCAGAAAGAAAATATGCAACACAGITTAAAACTTAAATGCAGAACGCATTT 
CrAGCCTAGGACAGACCTGGCGTATGTCAGCTATGTGTGAGAGACCATGTCA 
CGTCCTTTTGCAAGGTAACTCTGGAGCTCCTTTCACCAAGAGACGGAGTCTGT 
TTCCTCACCCTTTGAATCTGGCCTGGCCTCCTGACTTGCTTTGACTAATAACAT 
GCGATGAAAGIGACTCTCATGACCAAAGCAAAACTTTATGAGGTCTTGACAG 
OTCTGCCTTCACTCTATTGAAAAGATrCTGCCACCATGAAAAGAAGCCTAGT 
ctagcitatrgg<k}aataagaggccatgaaaagaacgaagatcaaca^

acagccaactgccagacacgtgacagaggccattctggaccatccagtccag 

gattctaatttcaggctgtcttctctgccttcttaacactgcctcttgatacct.- 

tcaacctctcctgtggcttaaattattaatcatattctaaatgccaaatctgt 

Mtttc'acctctgaagtcAcctatactccagatccatatatacaaagatcttt 

■tggacactatcatttggatggccaaaggta.tttcaaattcaagccccaaatg 

gaactaattatcttatciirtacaaccttgtgctcctrtacctataaatatatct . 

tgatgagtaacatcctcaaccattcagccttaaccn:catratcttccttgctg. 

caaacccagtaggtcacaagtcctatccattcattctacttcctcaacatctc' 

tggaatctttctcttctctctatctxgattgctgctaccataatcatitcttac' 

tragataacagtcaaatcctcacaactggacaactgagattggatctccaag 

TTAAAGGCATGITACACAGGACCTTAAAAGGCTAGCAGAGGAATCATGACAT 

gatgaagaactgcagtcctgctcntttctccacactaaagccagagagattg 
aaaatfraaacctgatcatgtcattcattcccctgcttacagtcctagaactc ' 
taagtccctacagcttcaggataaaaccccaacttagcttcacatatgaagtc 
ctn"atgatccttgcctatttctccagctitatctcaccagttgcccccttgcc 
ccatcccagtcaccatgaatacaaaaaaaaaacaccctatacacacctataa 
ccacactggactgctttgcagtitotgaatgtgacatggaitctctagcttct 
gttccntratataggcttctcccrcagcctggaacaggctttcctgtcctcttt • 
accatgctaaaacctatataaacaaaattcaggtgtcacctcgtctagatca 
ccgtttttitgacacccctaaagaatgtaggttaggttccctttactatgagt 
;tctcaaagcacaatgtgctcacctagacgacatcatttaacacattgaattta 
cacagtggagataatgcattcaaagacccagcgagtggagtctgattatagt 
aaagaagaatattacgaactgagattggatcgccaaggtataggcatgttac 
gtacgaccttaaaatgctaggaggaataacgacatgatgcagaaaaagactt 
ctgcataaaaggatgacaaggccgggcgtgijtggctcacacctgtaatccca 
gcactttcggaggccaaggcgagcagatcacgaggtcaggagattgagacc 
atcctggccaacatgtcaaaaccccgtctctactaaaaatgcaaaaattagc 
tgggtgtggtggcacgtacctgtaatccca:gctactcgggaggctgaggcag 
gagaatcgcttaaacctgggaggtggagattgcagtgagccaagatcacgcc 
actgcactccaacctggagacggagcgagactccatctcaaaaaaatgaata 
aataaataaaagtaaaataaagggatgacagagtcagagtatggttaaaac 
tg<aaaaatattatgtagtctagcccctttatttttataaaggagaaaat^ 

CCCCGGGAAAAGGGCTTCCTCAAAATCACTTTAAAGTTATAGCTTCAGGAAT 
ATGGATCTGCAGCAGTGCTTGGAATGCATAAGGGAAAGGGAGAGGCTAGAA 
TCACAAAGACAGCTGAAAGTCAAGTCAATTGTCTAATAGAGCTTCACCCAAC 
AGAACTTTCTGCAAAGAtGAAAATGTTCCAATTCTATATTTATTCAATATGTT 
AGCAACTAGCCACATTTGGGCACCCAAATTmAATTTACTTTAAATCCAATT 
TGTCATATGTGGCTACTGTATGAAACAGCACAGGTCTAAAGCATTTCATGTCC 
AAAAAGGAATACCTTGAAAAACAATTCACTTCTACTAACAGAAGAAACTAAA 
ACACCATGAACACTTGAAGATTGACTAGTATCACAtTCTCTTACCTCCTCATA 
GCTTGCAGTTCTATCAATCCGGTGAATAATATCAATATTAACATTCCCCAGTC 
•GTTCAGCAAATGTAAGAAACTGGAAGGGAAAGTATATTTAAGATACATAATT 
AATTAAAATTTATCAGATCTTTAATATCTATTTGAATGCTGCATGTAGGCATC 
TCTAATCACAAAGGATAAGTGGAAAAATAAACTGAAAAACATACGGCCGTA 
AACAAAtTTACTGCATCACTGTTCAAAGATAATGAATACTTCTATGTTTGCAT 
aatitcrctcagctatgtcatttcaaataaaatttccattgccaga
gcctaggtggatgctggcaattagtctcgctagatctattaggtttcataccc 
tcccataagcatggggacctagcaaagtcgctgcaataaaagtgtttttaaa. 
catatacagagctatgattgtatcctaaggaagacctggaaa.caatctatca- • 

AGGGGCAAACAGAGAAAGCGCTGTATATTTGCCCTTAGCTGGGAATCACTCA 

ccgccagccgactcgcccaattcggtctttaaagataaaagagcaggggaga 

attggtccraagcaatctcctggaatagtgaatttaattctggactacaggaa- 

attcccaggactggccagaccecataaaacatgggtgaaacttgctgtccac 

actt'ctttcctcctccaacccatgtttactacatccagtgtctccctc^ 

cggagcctccaggaaagtgacacactcggcccagaagtctgagiacccctgga 

gtctcgctcagagcctgtctcacgactgaggcagcggagacccgcggctccc • 

tgcctaagctcccgcgctcacccggtaggtgtrctcggtcttgtgggaaacgg 

GCnrrGTCrrCATGGCTGCAGAGGGCCAGTGGCeCGCGACGGCCTCGGGAGT . 
GTCGAAGGGATGCAACeGACAGTAAGGAGGGGAAAGCGGCIiCACAGGCTAT 
ACTCTCCGATTCCCAGATGCCCAGACmcrCACGTGCGGCTTGAGCCCCTGG 
GCGCCGCeATGTTGGAGACAAGGAGGAGCCJGAGTGGGTCACGTGGACGGA- 
AAAAAGAACGGCGCAGGCGCACCCTTTGGTGGGGTGGGGCCTACAGGAGGC 
GGGGCTGCGCACATAAGGGCGGGCGTTTGGGTGAGGTGTTCTTTTCACTCCCT 
TCGGTAAAGGTTTAGAAGACAAATGTATTTTCATTATAAAATAAAACATACC 
TGTAATCGTTATCAACTAACATTACTGTCCCTCACTACGTACCTGCATCGTGC 
AAAGAtCCTTTCATCCATAATTTCACAGTAAAGCTTATTAGGGATGTTAATAC 
AAAGGAGGTACTGCGTCTATCTATATATCTATATATAGATATATACTTTTTTT^ 
TirmrmGAGACGGAGTCTCACTCtGTCTCCCAGGCTGGAGGGCAGTGGC 
GCGATCTCGGCTCACTACAATCTCCGCCTCCCGGGTTCAAGCAGTTCTCCTGC 
CTCAGCGTCCAAAGTAGCTGGGACTACAGGCACCCGCCGCCACGCCCGGCTA 
ATTCrtlTGTATTITAGTAGAGACGGGGTTTCACCTTGTTGCCCAGGCTGGTC 
GCGAACTCCTGAGCTCAGGCAATCCGCCCGCCTCGGTTTTCCAAAGTGCTGG 
GATTACAGGTGTGAGCCACCGCGCCTGGCCGGTACTGTGTATATTTTTAAGCC 
AATTTGACACAAGAGGAAACCAAGGTTGTGCAGCTGGTAACTGGCAGTCTTC 
ACTCAGACTCTAAATTTTGGTATTCTTAACCACTACGCTGTAAATAAAATTGA 
•AAAATAGAGAAGGAGTTTTAAGAAAAATCTTITAATTTCATGCACAACCGCT 
GTTAATGTTTGGTGTTTTTCCCATCACTTTAATGTCTACACATAGCATGTrGl^ 
AAATTGTACAGTTATATATACATATATGTACAATTTTAAAAAATAATTTAAAA 
ATATAATTATTTAGCCCATrTTACATTCATTCATTCATTCAGCAACTATTTCTT 
GAGAGTCCTCTGTGGACCAGGTACTGTTCTi3GGAGCCAAACGAGAAAGGCAA 
TAATAGTATTAATTAATTATTTGAAAATATTCACAGCCTTCAATTCCTGGGCT 
AAAGTGAGCCTCCCACCTCAGCCTCCCAAGTAGCTAGGGCTACAGGCATGCA 
CCACCACACCCAGCTCTmATTATTATTATTATTATTATTATTATTATrATTA 
TTATTtGTAGAGATGGGATCTTGCTATATTGCCCAGGCTGGTATCAAACTTCT 
• GGACTCAAGCAGTCCTCTTGGCTTGGCCTCCCAAAGTGCTGGGATTACAGGG 
GTGAGCCACCATGCCCAGACTGGAAATACTTGGTCTGAGATGCTTACATrTTA 
GTCAAGGAGTGATGTTAAGTAGACAATTGGATATATTGTTGTACAGAGCTCA 
CTGTAGCAGTCTGGGCTGCAGACGTGACTTAAATACACCTAAATACCTCAAG 
TGTGAGGCATTTGCGACTGGGAAAGAGAGCCAGTAAGGTAGAAACAGACTA 
GTTATGGCATGGAAGCAGAGCTGGTGGCCAACTATACTGCAAATTCGAGGAT 
AGAGAAGTGGTTGCTGGATCTGACAAGGTGAAGGTCACTGGTGACCTTGCTA 
agtacagtctagttggaAgaaaggaagtgaaatatgcatggattgaggac^
taatcagagttgagaaagtgaacataaca:ctttcaaattgttattggcattat ■ 
cacatcaacattttctgtgtcaccacagggttgttcn'aaatattttaatgcct . 

. CC AT AGCAtlTATAGATTTrm ATAATAAAATATTGAOaT.GCCTA^ . 

attggctitcagtaggctaataattggtctttttt^^ 
gtcacagggaaatctaagaaagctgtagaacttctcccagaaagatgcaaag 
aagctcatccacactaaaatgtgcaggtttcagagagttagaagcctgcagt 
•taactgagggtagaagcccccactcaaagcaatatitrcccaacctgttgcc . 
. aaaatcacccccacatcattgtgccattttcctrctctgcaatitatt^ . • 
aatggagttAactggacaataaggagtgaagagaaaaatcaaagaaactga 

AAGAACTGACACAAGTTATATTGACCATTTATTGTTTGCATTGTTOT • •. 

CATGACCCTCACACACTGACrrGACTCrGACTGGCITAACAATATGTC • 
AACATGGTGGTTGAGAGCATAGATGGTGGAACCAGGTTGTCTGGAGGTAATC. 
CTGGTTCTGCCATTTATTAGTAGGGTCAGTTACCCnTrCAGAATGACAT^ . . 
GTATAteAAATGAAGCTAACAA>^GTAiG<:n"ACCTCACGGGTGGTGAl^ 
CAGAGTAAATAAGCTAATGGATAAAAGATGCTTAGAATGTGCCAGACACCAT 
AATTGCGCAAGCTCAGAAATTACnTAACCrTTCTGAGACCCAATT^ .. 
TATAAAATAGAGATGACAATAATACCAGTCTTTCAAGATACTGTGTTGGTGC . 
ATAGAGCATGAAATAAATGTITGCrAAGTGGATAAATAAATGAGCATTTATT 
CrCrACTACCTGGCTACCrGACAAGTTCCTTAAGGGCTGACACCTTATACTTT. 
TrCAACnTTGTGTCATCAGTTCCOTGAACATAAGAGTTGTTTAATGAATGTC 
TGTTGAATTAACAAAGdTTATAATACITACATTGACCCGGACAAAtn'AAGA 
GTAATATAGTTCCAGCATTGGATGTGAATTGCACCACATGATTTTTGGTATCT 
CTCAAAACTAAGTTTCTAAGCTTTAATGGAATACCGAAAGCATCCGTGAAGA 
GCACGAGTGTTTCTTATAAACATTTCCTTGCTrCCAGCGTrTGTC 
GGTATAGGATGAAtATCCTATACCATCCTtTATATCCTTmCCATTAGGGCT^ 
TGCCTGTCCTAACTACTCCAGCCAGTAAATAACATACATTATTTTTTCAT^^ 
TTATTCrrAAGCnTrATAGGGCGCTCCrGGAAGTITrGCTTm 
TCOTATTGCTTAGCGTGCTGCAATTTCAAGCCAAGAAACmAAGAGCATGT 
AAGAGCTAGGCCCAGTGGCTCATGCCTGTAATCCCAGAGCTTTGGGAGGTCA 
AGGTGAGAGGATTGCTTGAACCCAGAAATTTGAGACCAGTCTTGGGCAACAA 
GGCGAAATCCGATCTCTACAAAAAAtACAAAAATiAGCCAGACATGGGGATG 
TGCACCTGTAGTCCCAGCTACrCAGGAGGCTGAGGTGGGAGGAACACTTGAT 
CCAGGAGGTCAAGGCTCCAGTGCTGTGATCCTGCGACTGCACTCCAGCTTGG 
GTAACAGAGTGAGACCCTGTOTGAAAGAAAAAAAAGGAAGGAAGGAGAGG 
GAGAAAGAAGGGAAAGTAAAGGAAAGGAAGGGAGGGGAGAGGAGGGAAAG 
■ GAAGGGAAGAAAGCACATGAGGAGTTTACCCAGCCTAGACAAGAAAGTGGA 
ATCCAGAGAAGGCTTCTTGGGGGAAGTGACATCTAAGCTGAGACCIGGAAAA 
TG^VTAGGAATTAGCCAGGCAAGGAATGGACATGAATGGTGTTATCTAGGTA 
GAGGGAGGAiGTATAGGCATITGTCTCAAAATGGTTGAGACAAATTTGGTTAC 
• TrTTAGTTTCCAAGGCAAAGCCACACCCTGTCAAATTAGCATTGGCCATGGGT 
AT.CAT(nTTAGCATCCTTGGACTAATAATAGGAAAGAATAGGGACAAGATGA 
AGCTCAGAGGTAAGAAGAGCCTTGGTTTTCTCATCTGTAAAGTGAGATAGAC 
ATATGAGAAAGTCAACAGTGATATGAGACAGTTGAAAGATTAACACATTGTA 
TCAGTTCTGAGAGTCATATGAGACCGTTGAAAGATTAATTAATACATTGTATC 
GCTTCTCAGCGTTTTGGCrAAQATCAAGTCTAGTAATTAACACATTGTATGTG 
GTGATCTGGAAGGGGGCAAGTCGACCTAGTGGCATGGTCTTGTO
AGTTCAAGGAGTGCTGTGTATCAGGGCAGAGAGGATTTGACTCAGAACAAGG 
GACCAAGGAAGTGAATGTAAAAAAAAGAAAGAGAAGAGAAAGTTTTAGTAA ' 
TTCTTTGTTGGTITITrCTTAAATAGAGACAGGGGTCTCACTAeATTGCC^^^ • ' 
GCTGGTCTTGAACTCCTGGGCTCAAGCGATCCTCCTGCCCAGCCGAGTAGTTC •■' 
. TTGACTGTGGTA(3TAAGGAAAGCTGATCCACGTATCTCTTCTTGAGAAAACTG 
■.TGTATTGTtGACAGTGTGTGTAAATCAGGAAGCAGTGAGAGCATGGAGTTTG 
GATTTGGGACAACTGGGTCCCAGTrCTAGCATTTTTCATGTATTAGCCAGTAA 
CTGGQCAACTGA(JITAACCTrTCAGCCTCAGmCCTCATCTTTA^ 
ATAATAACTAGTTCTGCCrmCCCTTAACGGTTGCTAAGAAGACCATTCGAT. 
ATAAAGGAGGCAAAGTCCCCTGTAACCAATACAGAGGAGTTACAGAAACACT 
AAGTATTGTTTCC(nTTGCATTGTGTGATCATGTTCAGCCCTGATAeCACAGA 
GGTTCTATTCTCCmCCTTATmGAAGCTCAGGCATTAGAAACATTAGACC ' 
AGAAATTGCGGATTTGTGGGGCCTATAAGCTCAGGTAGCCCACAGATAAGTT 
TTGTTTACCAAACATATTCITCTrCTTCTTITIT^^ 

CCGTTGTCCAGGCTGGAGTACAGTGGGACGATCGTCACTTACTGCCAACCTCA 

AGCTCCTTGGCTCAAGCGATCCTCCCACCCCAGCCtCCCTGGTAGCTACAGAT • 

ACTACAGGTGTGCATCACCATGTCCAGCTAAtTTTAAAAACATmTAGAGGT 

GAGTCTTGCTGTGTTGCCCAGGCTGATCTCGAACTACTGGGCTCAAGTOATCT 

TCCTATTCCAGCTTCCCAAAGTGCTGGCATTACAGACATGAGCTGCCATGCCC 

AGCATACCAGATGATATTCTTGAAATITATrTITATmTTATAATCAGATACT 

CTCTCAGCAGAATCACAAATGTTTTAATTTGTTAAAAATCTGAAAAmTAGG 

TAAAACTCTAGAmTCAACTTCTCTTGAAAAGTAAAAAAAAAGAAAACTGC 

AATACTGGGCCCATATnTGAGAAGCAACAACCAGCTGGAGTTGAGTAGTAG 

TGGCTCTTTGATGCCACCACTTTGTCTCTGTGCACACCACTCCTrCCTTTTTGT 

CCTACCCCAGGCCCATGTCATGACTTAAGGTGGATACCTGGCCCCTGTGGAA 

AGCTCAGTGTGTAGCCTCTGCCTCAGAATATTCCTCAGGCAGAAGGCTGTTCT 

CGTCrrrGGTTITAAACATGCCTCATAGGCAGCAGAtrATTITrcrGTTGCTTC 

TGCAGCTGCTTTTATTGTTTAATGCAGTGAGTGACTCAACTTGTTGTTGCTGTT 

GTTGTTTCTGTTGTTTGAGACAGACTCTCACTCTGTCTTCCAGGCTGGAGTGG- 

ACTGGCGTGATCTCGGCTCACTGCAACCTCCACCTCTCAGGTTCAAGTGATrC 

TCCTGCCTCAGCCTCCCACATAGCTGGGATTACAGGCACCCGCCACCATACCT 

GGCTAATTTTTGTATTTTTAGTAGAGACGGAATTrCGCCATGTTGTCCAGGCT 

.GGTCTCGAACTCCTGACCTCAAGTGATCCACCTGCCTCGGCCTCCCAAAGTGC 

TGGGATTACAGGCATGAGCCACCCCGCCCAGTTGAGTGACTCAACTTTTTATA ^ 

AGGGAGTCAGTGCAGTTTTTCAGTTGGTATTCAAATATTTGTAACACCTTCCC 

TATCCCTGAACACACACACACACACACACACACACACACACACACACACCAC 

TGTGGTCTGTATTCATClTGTTTTCCrTCCTCACrTrCGCTCACCATTrGCATTT 

CTGTCATGGACTTTAATTTCGTTATTCTTTAAAGTAAGCTATCTCAGAGGATA 

ATCTAAATTAACCTGCTTTTAGAACAATTTAAACATCCACATACTTTTACCTA ' 

CCCCTGTTTATGATTTTTATCCTTTCTTTTGATCATTAGCTAAACrGTTGGCAT 

. GATGTTTAGGAAGGATGAGTAGTCrCACCACTGGGTTGTATCTCCCCTTTATT 
TTCTCACCITTCTCTTGGmGGTTTTGGTTTCCATTGTTACAGTGTGATrGCTT 
CTTTGAACACAAGGCGCTACATCACAGTACAAAGGAGCTTGGACTCTGCACC 
CAGCCCTCCCAGGCTCACACACTTGGGCTGCCACTGCTCTAGGAGCTTCCATT 

. TACTCATCAATAC(3GGGGATACTAGTGCCCCTCATGGGGTGGTTATGAGGAG 
GCAATGACCTCATAGATTGCTTCTCAAGTGTGGTCCCTGACCAGCAGTATCAG
CACCTCATGAGAACTTGGTTAGCACTGTAAATTCTCAGAeCCTGCTCCAACCC 
TCCTGAATCAAGAACTCTGGGGATGGGGCCCAGCAAACTGCTTTCATAAGCC 
TTCCAGGTGATTCTGAGGCAGGCrCTAGTAtGGGAATGACTQACntAACAm^ 

• ACTACAGCACCTAGAACATTGTCCAACAC ATACCATGTGCTACAGAAAGTGT ' " 
TrATrCTTATTACTGTCTAGTCTrrACATAAATGTTTGCATCATC^ . 
TrCTTCTATTCATCTCrCTGATATAGTAGTATGATACTGTTAGCCT^^ 
TTATTTTTATGGATACATAACAATTATATATATTTATGGGCTACATGTGATATT 
TTGATACAAGTATACAATGTGTAATGGTCAAATTAGGGTAATTGGGATATTC 
ATCACCTCAAGCTTTrATTATTITmGTrAGAAACAATCCAGC.TCCCGTCT^ 
TAGTTATtTTGAGATGTGCAATAAATTATTGTTAACTACAGTTGCTCTATTA^ 
GCTACCAAACACTGGATCnTAmCTTCAATCTCATTGTATTm 

• AACCACCCCCTTTTTATGCrrCCTCCACTACCCTTCCCAGCTrCTGC^^^ 
CATTCTACATTCrATCTCCATGAGATCAATTTTlrTTAGCTCCC^^ 
AGAACACTCATATTATtTGTTTTTCTGTGCCTGGCTTAOTCATm • 
ATCCrCCAGTTCCATCCATGCTGTTGCAAATGATAGGATTTCATATTTTTTATG 
GCTGAATAATATTCCATTGTGTATATTTACTACATTTrCtrTACCCATGCATCC . 
•ATTAATGAACACTTAGATTGAGTCTATGTTGATTATTATGAATAGTACTGCAA 
TAAATATTGGAATGCAGATATCTCTTTGATATGCTGATTTCCTTrTC^^ 
ATATACCCAGCAGTGAGATTGCTGGATCATATCATAGGTCTAATTTTAGTTTT 
TTGAGGACCCTCTATACrGTTCrCCATAGCCATTGTACTAATTTACATTTCCAC 
CAACAACATATGAGAGTTCCCTTTCTCCACATTATCACCAGCACCCATTATTG 
CCTGTCTTTTTTAtAAAAGTCATTTTAACTGGAGTGAGATGATACCTCATO 
AGTTGTTTGTGGGGCTTmAAAATTTTGTTTTGTITGTCAGAOT 
CCCAGGCTGGAGTGCAGTGGCATGATCATAACTCACTGCAGCCTTGAACTCC 
TAGGCTCAGGCATCCTCCTGCCTCAGCATCCAAAGTAGCTGGGACTACTTGTA 

gttttgatttgcatttctctgatgAtaagtgatgttgagcacctttttacatgc 
ctatttgccatitgtatgtcitcntitcagaaatgtctatccaaatat^ • 
cattttttaaatcacaotatttttactattgagcttcttctatattct^ 
taatcccttgccagatgggctttgaaaatattttctccccatggattgtttctt 
gactttgttggttgtttcctttgctgtgcagaagcttttttagtttgatgt 

CTCATrrGTCCATrmGCCCrrGGCTGCCTATGCTTTTGAGGTCTTACTCAGG 
AAATCTTTGTCCAGACTAAfGTCCTTGAGCATrrCCTCAATGTTTTCTTCT^ 
. ATTTCATAGTTTCKKKJTCTCAGATTTAAGTATITAAGTCATTTTGAm . 
GATTirrGTATATGATGAGAGAAAGGAGTCTAGTTTCATTCTTCTGGATGTGG 
ATATCCAGTmCCCAGCATCATTTATTGAAGACACrrGTCCTTrCCCCAAT 
ATGTTTTTGATGCTmGTTAAAAAAGAATTGACTGGCTGGGCACAGTGGCTC 
TTGCCTGTAACCCAGCACTTTGGGAGGCCGAGGTGGGAGGATCATTTGAGGT 
CAGGAGTTTGAGACCAGCCTGGCCAACATAGTGAAACCCCATCTCTACTAAA. 

■ AATACAAAAAATTAGCCAGATGTGGTGGCACACGCTTGTAATCCCAGCTATT 
CCAGAGGCCGAGGTGAGAGAATCACTGGAATCCAGGAGGCGGAGGTTGCAG 
TGAGCCAAAATCATGCCACTGCACTCCAGCTTGGGCAACAAAGTGAGACTCA 

■ TCTCAAAAAAAAAAAAAAAAAGGAGGGGGGAGTTGACTGTAAACATGTGGA 
TITATTrCTAGGTTCTCCATCTTGTTCCATTGTITrATGTGTCTATTrTTATGC 

- AGTATATTATGTXn'TGGTTACTATAGTTTTGCAATATAATTCGAAGTCAGGT 
AATGTGATGCCrCfAGCTTTGTrCTTTATAAGTAGCGTCATmAAATGAGGG 
atct.gagoctcagagaact'gctaaatggtagaaagagttgaaqctgggtctt
ctaacttcatgttcagtgctctgtttgatgtccccacactatcccacatcttaa 

. gagtgtaaactaataggggcaaatttagataaattggccaggcatggtgggt- 
cacgcctgtaatcccagcacitrgggaggcrgaggtgggtggatcacttgag ' 
gtcagagtttgagaccagcctggccaacatgatgaaaccctgtctctactaa 

AAATACAAAATTAGCCAGGTGTGGTGGCTCATGCCTGTAGTCCCACTTACTCA 
GGAGGCTGAGTCAGGAGAATTGGTAGAAGCCAGGAAGCAGAGATTGCTGTG 
AGCTGAGACCATGCCACTGCACTCCAGCCTGGGAGACAGAGCGAGACTCCGT-. . 

• ctcaaAaaaaggaaaaaaagataaataaatgcttggctgttgtagatatttg 

TAGATTTCCTrGTCCTCTCTTTTCAGCAGCAGTCCCCAACOTGTTGGC • 

gggctggtttcgtggaagacaatttttccacagactgggggctgcgagtgag 
..ggggtggtttcaggatgaaactgttctgcctcagatcatcaagcattagttag- 
attctcataagaagcactgcaagctggatccctgtatgcgcagttcacaatg 
.gggttccactcctatgagaatctaatgccgccgctgatctgacaggaggcga 
agctcagattgtaatgctctmgcctatctcrcacctcctgttgcgcagccc^ . 
gttgctaacaggccatgaaccggtaccggtccgcggctcaggggttggggac 
, ccctmcaggaaacagcatcctgattttcctttgaagaatcaatctgccttc 

ACTCTTAGTGTCTGAGGTTCGAGCTAGGGTCCAGGGAAGGTCTGTGTCCTAG 

GCTGCATTTCTTTGACAACATCATTGGCTTAGAGATGGGCAGGTACCCeAAGC . 

TGGGCCAAGCCCCTGAGCTGAACTTTTACTAGAGCTAAGCCATAGGTAAGAA' 

GGTGCTTCTTmCTTAGATTTGCCAGGATGCTAGGAGCCTGAAGTTTTTGGT 

GGTCCrCGTTGTCACTTCATGGAGAGGTCGGCCTGAGAATGAAGCCTATGCA 

AAGGAAAACAGAGGGGACAGGGAGAGAGAAATTGATAATAAGTATTGATTA . 

GCTTATTGGATCCATCCAGTCCTTGGACTTTCAGTAACCTTAGTTCACTAATTC 

CCT(nTrTGTGmAAGCCAATTTGAGTTGGGTTCTGTCATTTGCAACCA>^ 

CAATATAGAATTTATGCCGATAATITAAGATTTGTATGTTGCTCATAAGATTG 

GAAACCTAAGCCTTCCTCCGAACACAAATATGTGAGTAAGAATAAAAAAAGT 

CAAATACAAAGTCTTGATGATAACAGTCAATGGGATGGATTTGGAAAGTGTC 

TCAGACTAGGAGTCAGAAAACCTGGGTGCTGATCCTAGGAGGTTCACTTAGC 

CATGTGACTTTGTAAAATGTGTGAAACTAACGCTGTAAAATATGAAAAATTA 

ACCCTGCAACAAAAAAGTAATGTACACATACATATTTTGAATGTTGCATGAT . 

.ATAAAGTCAAGTAAGTAATTTATTAGAGTTGCTCCCAGAACTTTTTCTGTATA 

TCCAATAATTAGTTTTCAATAATAGTGAAGGAAAGACCTAGGAGTGGACATT 

TCTCAGTCTCTCATACACCATATGCACTGATCTGGGTGACAGAATACCTTTAA ■ 

agatacacttaaaaatgcatcttaagagaataagacaagccacagactggg ■ 
aaaaatatctgcaaaacacttatctaataaagttttgtattcaaaatatgcaa 
• acacccttaaaactccacaataaaaaacaaatagcccaattaaaaaatgaat ,. • 
gagagatctgaacagacacctcactgcagaatatatacagtatgaaaagatg . 
■ CTCAAC ATC atatgtcattagagaattgaaatttaaaataacaagataccac 
tacacacctattacagtggctccactccaaaatactgacaacactaaatgct 
ggaaaggacgtggagcagcagaaattctcattcattggtggtgagaatacaa 
aatggcacagccactttaggagacacctatttcttagaaagtgaaacatagg . 
cttaccatatggtccagcgattgtgctcttaagtactcatccaaatgaactga 
aaacttatagceacacaaaaaccagcacacaaatatttatagcagctttatt . 
' cataattgccaaaaattggaagtgaccaagatgttcttcaatggataaacaa 
accatggaacatttagacaatggaatattattccaggataaaaagaaaccaa 
CTATCAGGCTATTGCAAAGTGAAAGAAACCAATCTGAAAAGaCTACATAGG^
TrCATCTTTAAGACTCCAAATATGACATTCTAGAAAAGTTAAAACTGTAGAG 
ACAATAAAAATATCAGTGGTACTAGAGTGAGTGGGGGAAAGAAGGGAGGGA . 
TCATTAGGTGGAACACAGGGCATTCTTAGGTCAGTGACACTACTTnATGACA - 
CrGTAATTATAGATACATGACTAATGCAACATATTACTAATGAAGAAAATTG 
TGCGGGGAGGGAAAGTAGGCATATGGAAACTGTCTGTACmCTGTTCAATTT 
TTCTATAAACCTAAAACCACTATAAAAATAAAGTCTATtAATlUlTriTAAAA ' 
AATGTGTCTCTCTGTAATTCCnTCrGCCTCTCATCCTGAAGTCCT^^ ' : 

GGAATCAGGAAAGACAACACCTTACTCTTGAATCCrGGGAGCCCGAAGAAGA : 
AGTAGGACTGGAAtCrCAGGAGTACTAGAAACCAGGAAAATTTGGATCCACT. 
GACATCCAAAGTGGATCAGAAAGAAAAGTGAGTGCCTCAGACACCAATTGG 
AAAAGAAAAATTTATTTACTTTCTCAATGAAtATTTATTGAGTGCGTCCTGTG . 
TAAGGeACTGGTCTAAGAGCTGAAGATACAACAGTAAACAAAGTTTCTCCCG 
TCACGAGGCTCAGGTTCTAGTTTGGAGGGACAAAAAGAAAAAAACATGTGA 
ATTTATAGACTGTCAGATGCAATAAGTGCCATGGAGATAATATAGCTGGTGA 
GTAGTGTGTGGCAGAAGCTGTTAATTGTTCCGTAGAATTAATTGTTCCATTCA 
TCirrTAGAAAAAGAAGCCTCATCTCAAAGTTTAOTGGTTATGCAT^ 
ATTACAATCTTCATTTCCCCAGATCCATTCTACTCTCCTGCCCCCTTGCACCTA 
GATACAGCCTTGTGACCAAGGTCAGCCCCATGGGAAGTGCTCTGCGTAACTT 
CCAGGTCATTTGCTAAAAGATAAAGCTACTtGCCTTGGATTCTTTCTrrCTCTC 
•TCCTATTGACTGGGAAATAATGATTGAAGAATCCTTGGAAGCCAAAAGTGGA 
AGACAGCAGAGCCCGCAGTTGTAGTCTGTTCGATTCTTAGCTGTTACATGAGA 
GGAAATrrTTTTTTTTGAGACTTCCCCAATGCCTGGACT^ 
GGCAATATATCCCCr[TGTGTGTATAGGCAAGGCTGA6TCAGCTTTCTGTCACT 
TGTAACCrAGATTAATTTTTTCTGATTAATTAATTGTTCCAATTAGTACAAGA 
ATAATTTATATACTAGACTTrCTGCCTCACCTTGAAACCTGGAAGCrCACTGT 
ATCAGTTCTCTCTTGCCACAATGAAGTTGCTTGTAAAACAACCACAGTGGCAA 
GCAACAAGTCTACAGTTGGCTGAGTGGCTCTGCTATTCTGTATTGGTCTTGGC 
TGATCTTGGCTGGGCTCATTCATGTGTCTGTGGTCAGCTGGCAGACTGGCTGG 
GGGCTGGCTAATCTAGCACAGTTTTGGCTTAAATGACCCAACTACCTGGCTCT 
GCTrCACAGCATCTCTCATCATCCATTAGGCTATCCTGAGCTXrm 
AAGCAGGGTTCTGAGACAGACAGGCAGAATGCAAGGTCTAAGCTGATAATG 
GTGACAGTAGGAGTGGGAGAGGATTTTGTTGGGTAGAGTAAGTGAGAAGGAA 

ggtggatatttaaggatggaggaaatagtgtgtactgggaagaggtgcaaat 

tcctattgcaaaggggatggctagtagagggaggggtggagaattggagaga 

tttttggaatgagtctaggagatttatgataattattgcgataattaaacagt 

ggtggtagaga«gaaaaggaggaactggatagtggagcttaaggaggtgcgt . 

actaaggacaaattcctgagtagatgcaggggatgaaggtatgctattctga 

trgcagggatttagcatatgtgattttrntttaagttggaaataggcagaga 

gagcgtgttttgaaagagaacttaagggtgatcggggcgaatgtggatattgt 

ttgagaaaaaagaaggagggagcaaagttaggggacttggggagggggaga 

cagtgagrqagggacacacgtgggtgtagatgtttgcctggtgagtgggggc 

agaggagacgttccagggtggtgtggtgttgtcttgtgagggcaagggagaag 

acagcagtggggacgtggttggagttrggagggagttaggctgggtrgtgttt 

CITCCTTCCmCTITrtrTtrGAGACGGAGTTTT • 
• GAQTGCAATGGTGTGATCtCCGCTCACTGCAACCTCCACCTCCCAGATTCAAG 






. CTATTCTCCTGCCTCAGCCTCCCGAGTAGCTGGGATTACAGGCATGCACCACC. '• 
ACGCCCAGCTAATTTTGTATTTTTAGTAGAGACAGGGTtTCTCCATGTTGGTC •' 

• AGGCTGCTCTCGAACTCCTGACCTGAGATGATCCACCCACCTCGGCCTCCCAA 

• AGTGCTGGGATTACAGGCGTGAGCCACCATGCCCAGCCAGCCTGGGTTTTCA • . 

■ ACCrGGCCTCnTGCrrATTACTTATGAGACTTrGGACAAATTAm .• 

tatcaatgattgtaatagtactgggcctcatagggttgttgggcagattaaat. ; 
gaaagaaaggaaataaagcacatagcaagtgctcaataaattttagctatta 
tattttctctaaaaatacagcatmcctatttggtttgfttcttgt ; • 

agtctgggtttaggcattcaagagagctgaaaatatcataatactaaatatt . ' 
•tagatggcaaagaatgaattcaacttataaaagtacctggagtataaattca • 
cattrtcttgtaagaagagatattrataatctggtttatttgtttacttactaa • 

• caaacatttactgagagtctactgcgagtcaggcattgtagtagtttgct 
gctgcaacatattaccacaaacttagtgccttaaaacaacatatgttctgga • 
agtcagaagtctgaaatgGatcatctgggctgttccttatggaagccccagg . 
ggacaatctgtttctttggttittcxiagctrcrggaggctgccagm ■ 
gatgatggcctgtttcactccaatcrctgcttccaccatcacctcttctctccc. • 

■ .ttgactccattgtctcattgccttcrgtgacactcttgcctgtccatiatcagc . 
crctccatatggattgtgctttcttacagcatggtggcctcagggtagtcaga 
catggtggctcaaggctccaaaaatgagtattttcagcaagcaaaacaaaag 
ctccatgctcttrcatgaattcacattagaagtcacatagctttgtattccat . 
gtgggagttgaagaatgagaagtcatggacacagggaggggaacaacacac 
tctgggtcctgttgtgggatgagggatgaggggagggacaaatacctaatgc 
atgcagggcttaaaacctagatgacgaggctaggaagaaactgcatcaacta 
acgagcaaaataaccagctaacatcataatgacaggaccaaattcacatata 
acaataltaacrttaaatgtaaatgggctaaatgctccaattaaaagacaca . 

. gactggcaaattggataaagagtcaagacccatcagtgtgctgtatttagga. 
aacccatctcacgtgcagagacacacataggctcaaaataaagggatggag 
gaagatctaccaagcaaatggaaaacaaaaaaaggcaggggttgcaatcct 
agtctcggataaaacagacrrraaatcaacaaagatcaaaagagacaaaga 
aggccactacataatggtaaagggatcaattcaacaagaagagctaacJtatc 
ctaaatatatatgcacccaatacaggagcacccagattcataaagcaagtcc 
ttagtgacctacaaagagacttagactcccacacaataataatgggagactt . 

• taacatcccactgtcaacattagacagatcaatgatacagaaagttaaaaag . 
gatacccaggaattgaactcagctctgcaccaagtggacctaaaagacatct 
acagaactcrccaccccaaatcaacagaatatacattttttttcagcaccaca 

■ ccacacctattccaaaattgaccacatagttggaaggaaagcactcctcagc 

AAATGTGAAAGAACAGAAATGATAACAAACTGTCTCTCAGACCACAGTGCAA 
' TCAAACTAGAACTCAGGATTAAGAAACTCACTCAAAACrGCTCAACTACATG . 
GAAACrGAACAACCTGCTCCTGAATGACTACTGGGTACATAACGAAATGAAG 

• GCAGATATAAAGATGTTCTTTGAAACCAACGAGAACAAAGACACAACATACC 
AGAATCTCTGGGACACATTCAAAGCAGTGTGTAGAGGGAAATTTATAGCACT , 

• ■ AAATGGCeACAAAAGAAAGCAGGAAAGATCCAAAATTGACACCCTAACATC 

■ ACAATTAAAAGAACTAGAAAAGCAAGAGCAAACACATTCAAAAGCTAGCAG 
AAGGCAAGAAATAACTAAAATCAGAGCAGAACTGAAGGAAATAGAGACACA 

. AAAAACCCTTCAAAAAATTAATGAATCCAGGAGCTGGTTTTTTGAAAGGATC 
AACAAAATTGATAGAGCGCTAGCAAGACTAATAAAGAAGAAAAGAGAGAAG 
AATCAAATAGATGCAATAAAAAATGATAAAGGGGATATCACCACCGATCCCA

GAGAAATACAAACTACCATTGGAGAAXACTACAAACATCTCTATGeAAAtAA 

ACTAGAAAATCTAGAAGAAATGGAAAAATTCCTT(3ACACATACACTeTCCCA 

AGACTAAACCAGGAAGAAGTTGAATCTCTGAATAGACCAATAACAGGAGCT 

GAAATTGTGGG^TAATCAATAGCTTACCAACCAAAAAAAGTCCAGGACCAG" . 

ATGGATTCACAGCCGAATTCTACCAGAGGTACAAGGAGGAGATGGTACCAfr 

CTTCCTGAAACTATTCCAATTAATAGAAAAAGAGGGAATCCTCCCCAACTCA • 

TITTATGAGGCCAGCATCATCCTGATACCAAAGCCTQGCAGAGACACAACCA' • 

AAAAAGAGAATTTTAGACCAATATCCTTGATGAACATTGATGCAAAAATCCT 

.CAA.TAAAATACTGGCAAACCGAATCCAGCAGCACATCAAAAAGCTTATCGAC 

CATGATCMGTTGGCTTCAtCCCTGGGATGCAAGGCTGGTTCAACATACACA • 

■MTCAATAAACGCAATCCATCACATAAACAGAACCAATGACAAAAACCACAT- 

GATTATCTCAATAGATGCAGAAAAGGCCTTTGACAAAATTCAACAACCCTTC 

ATGCTAAAAACTCTCAATAAATTAGGTATTGATGGGACATATTCTCAAAATAA- 

TAA.GAGCTGTCTATGACAAACCCACAGCCAATATCATACTGAATGGGCAAAA 

ACTGGAAGCATTCCCTTGAAAACTGGCACAAGACAGGGATGCCCTCTCTCAC 

CACTCCTAXTCAACATAGTGTTGGAAGTTCTGGCCAGGGCAATTAGGCAGGA- 

GAAGGAAATAAAGGGTATTCAATTAGGAAAAGAGGAAGTCAAATtGTCCCT . 

GTTTGCAGATGACATGATTGTATATCTAGAAAACCCCATTGTCTCAGCTCAAA 

ATCTCCTTAAGCTGATAAGCAACTTCAGCGAAGTCTCAGGATACAAAATCAA 

TGTACCAAAATCACAAGCATTCTTATACACCAATAACAGACAAACAGAGAGC 

CAAATCATGAGTGAACTCCCATTCACAATTGCTTCAAAGAGAATAAAATACC . 

TAGGAATCCAACTTACAAGGGACGTGAAGTACCTCTTCAAGGAGAACTACAA 

ACCACTGCTCAGTGAAATAAAAGAGGATATAAACAAATGGAAGAACATTCC 

ATGCTCATGGGTAGGAAGAATCAATATCGTGAAAATGGCCATACTGCCTAAA 

GTAATTTATAGATTCAATGCCATCCCCATCAAGCTACCAATGACTTTCTTCAC 

AGAATTGGAAAAAACTACTTTAAAGTTCATATGGAACCAAAAAAGAGCCCGG 

ATCGCCAAGGCAATCCTAAGCCAAAAGAACAAAGCTGGAGGCATCACGCTA 

CCTGACTTCAAACTATACTACAAGGCTACAGTAACCAAAACAGCATGGTACT 

GGTACCAAAACAGAGATATAGATCAATGGAACAGAACAGAGCCCTCAGAAA 

TAACGCCTCATATCTACAACTATCTGATCTTTGACAAACCTGAGAAAAACAA 

GCAATGGGGAAAGGATTCCCTATTTAATAAATGGTGCTGGGAAAACTGGCTA 

GCCATTTGTAGAAAGCTGAAACCGGATCCCTTCCTTACACCtTATACAAAAAT 

TAATTCAAGATGGATTAAAGACrrAACATGTTAGACCTAAAACCATAAAAAT 

CCTAGAAGAAAACCTAGGCAATACrrATCCAGGACATAGGCATGGGCAAGGA 

cttcatgtctaaaacaccaaaagcaatggcaacaaAagacaaaattgacaa 

ATGGGATCTAATTAAACTAAAGAGCTTCTGCACAGCAAAAGAAACTACCATC 

agagtgaacaggcaacctgcaaaatgggagaaaAttttcgcaacctactcat 

CTGACAAACKSGCTAATATCCAGAATCTACAATGAACtCAAACAATTCACAAG 

AAAAAAA.CAAACAACCCCATCAAAAAGTGGGCAAAGGATATGAACAGACAC 

TTCTCAAAAGAAGACATTTATGCAGCCAAAAAATACATGAAAAAATGCTCAT 

CATCACTGGCCATGAGAGAAATACAAATCAAAACCACAATGAGATACCATCT 

CACACCAGTTAGAATGGCCATCATTAAAAAGTCAGGAAACAACAGGTGCTGG 

AGAGGATGTGGAGAAATAGGAACACTTTTACACTGTTGGTGGGACTGTAAAC 

TAGTTTAACCATTGTGGAAGTCAGTGTGGCGATTCCTCAGGGATCTAGAACTA 

GAAATACCATTTGACCCAGCCATCCCATTACTGGGTATATACCCAAAGGATT 
ATAAATCATGCTACTATAAAGACACATGCACACGTATGTTTATTGCAGC^

TrCACAATAGCAAAGACTATTGTCrmGCTATrTGGAACCAACCCAAATGTC. ' 

TAACAATGATAGACTGGATTAAGAAAATGTGGCACATATACACCATGGAATA.- 

CrATGCAGCCATAAAAAATGATGAGTTCATGTCCTTTGTAGGGACATGGATG 

AAGCTGGAAATGATCATTCrCAGTAAAGTATCACAAGGACAAAAAACCAAAC ' • 

GCCGCATATTCTCACTCATAAGTGGGAATTGAACAATGAGAACACATGGACA ' ' 

CAGGAAGGGGAACATCACACTCCGGGGACTCTTGTGGGGTGGGGGAAGGGG 

GGAGAGAGAGCATTAGGAGATATACOTAATGCTAAATGACGAGTTAATGGGT 

GCAGCACAGCAACATGGCACATGTATGCGTATGTAACAAACCTGCA;cATTGt 

GCGCATGTACCCTAAAACTTAAAGTATAATAAAAAAAAAAAACCTAGATGAC 

GGGTTGAGAGGTGCAGeAAACCACCATGGCACATGTATAGCTATGTAAGAAA . 

CCTGCATGTTCTTCACATGTATCCCAGATCTTAAAATAAATAAAAAATAAAA. 

ATAAATAAATAAAAATAAAAATTATATTAAAAAAGAAGTCACATAiGCTTTAT . 

TTCCATACCCCATGGGTTGAAGCAGTCACAGCCCATTCAGATTCCAGGGAAA : 

GGGACACAGGCCACATCTCTTGATGAAAAGAACATGAAAGAATGTGCAGTTA 

TGTrTTAAAAACATCCCAGTAGAGTTCACGAACATGAGTTTTTACAGCAGAC- 

ACTACATTTCCCTGCCAGTTTACCTGCCTTGGGATGGTGGAGGTCTCTGAAGT- 

tggcagtcgtttcctgcaggattctaagttggatggcagcagctctccagctc ■ 

tgaggcaacgaaactgaaagctagtggagagttgcctgaattttgccttctc 

aggtctttccataagitctgtgaacactcaatttcctgtatcaaatrcctrctt 

cttgaaaatgcttagagtgatacctgttttttctactggagtctgactgattc 

aagctccaaagtctgccctcctaactgcctctcgcgttgttctaaacctttct ' 

ggtgctcctggcctgctcctttgcaacccacacagactcacacatccagcata 

ccctaagaagatgacactgcctcttagtgctcacaaaaggagtgcaagttat 

atgaacctcaactatccmctatccaactggaactgtatctgtcrgtttttcc 

(xcttctgctccgtcttagaagaaaagttcatcaatacttttgggaaaaaggt ' 

aaacirrtaaacacgatgcatggcacccttcatitatcttttcaacctgatttt 

ctgccatctttttatatgtgcccatgttaattatggtagactaactgctttacc 

aaatagactcataaaattggtggaattattgctgcaacagaaagaccgaaag 

gtctacaatggcctaaaccctgtaaaagtttatttcttgctcaaataacaatt 

ataggcaagcaaataatcagtggcatatgcccttctctatgtagggacttaa ' 

actaatagagagacaaccarrtccttccctccctccctccctctctctctctct 

TrCTTTCTTTCllllll"iUlllll"iTll"ll"llTrGATGGAGTCTTGGTCTGTCACC 
CAGGGTGGAGTGCAGtGGCGCGATCTAGGCTCACTGCAACCTCCGCCTCCCG 
GGTTCAAGCGATTCTCCTGCCTCAGCCTCCTGAGCACCTGGGATTATAGGCGC 
CCACCACCACGCrCGACTAATITITGTATTTrrAGTAGAGACAGGGTTTCG^ 
ACGTTGGCCAGGCTGGTCTCAAAGTCCTAACCTCAGGTGATCAGCCCGGCTTG 
GCTTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACTGCGCCTAGCCTGAGA 
CAACCATTTTCAACATTTGCTTTCCAAGGTCACCCTTTCCAGACATCCAGAAG 
AGAATGTAAACTAAAAATAAAATCCTAAGCCCCCCAACCAACTGAACAGACC 
CCCrCTTGGCCAAGAAGACCTCAAGAAAAACrrAAAAACTGAATTCCTGGCC
ATCACAAGAAGGGAAGGTCCAATATGCTTTGTTATCCTCCCTCCCTTTTGGAG
TTTAGGCACAACTGAACAGCATTAGTGTTAAAATAGAGATCGTAAAGCTAAC
AAAATGGACTTATTGTCACAATAAGATGCCAAATTACAAATAGGACCTAACA
CGACACAAGAAGGGTTGAGTCACACATTCTTATAATTCACTGTAACCCAGTG
TACrGGAAAACAATATCrTAATATGAAATATTCCTTTTTTGCTGCCTCCGAAT 
TmAGACAAAGCTTTATTTCTTTAACCAATTGTAAATTAAACAGTCTCT^

TTTACTTATACCCTGTAAGCACCTGOTCAAGATATCCCACCTTITCAGG 

AATCAGTGTAAATACCTTCCATGTGACtGATTTATATCTTTGCCTGTAACGCC- 

TGCCTCCCTAAAATGTGTAAAACTGTACTGTMTCCTACTACCAAGGGTGCA 

TTTCTCAGGACCTCnTGAAACTGTGTTCCCCAGGCCATGGTCACTCATATTGA 

crcagaataacatctttaaaatattitagccaggtgtggtggctaacacrc^^ • 
aatcccagcacritgggagaccatggctagaggttagcttgaggccaggagt • 
.ttaatgcccagcctgaagaaoatagcaagatcctatcrctacaa>j>l^^ 
aaaaaattagctgggtatggtagcacatacctgcagtcctagctgctcagga ' 
ggctgaagtgggaggatcgctrgagcctggagittgaggctgcagtgagdta . 
tgattatgatc attccactgcactccagcctgggtaacggagcaagaccttgt ' 
ttaaggaaaagaagaaaagaaaaagaaagaaagaaaagaaaagaaagaga 
gagagaaagaaagaaagagagagagagaaaggtagaaagaaagaaagaaa' . 
gagaaagaaagaaagaaaaaaagaaagaaaAagaaaagaaaaagaagga • 
aagaaagaaaagaaagaaagaaagaaaaaatttm atggagtttagtttct 

CCATTAACGGGGGAAGCATACAGAGGGTATATGTGAGAGGCTTTTATGGGTT 

agccctggacatggtacacatcacttctgctcccattccactagctagaatte 
agtcacaaggctacatctagctgcaagggaggctgggaaacgtcacctaact 

gtgtggccaggaagaagagaaaatgagtttrggtgaatgaccagctagcagt 

GTGTGCCATTGTGCCTTAAACCITGGCATGCTTATTTAATITrCCTATAGTTCA 

GCTTTGTCATTTCAAAGTTGAGGATCATCGTGGCACCTCCTTCACAGGGCTAT 

TGTAAGGATTAATTACATTAGTACTGTGAAGACCTTAGAACATTGCCTGGGTT 

TTTGTAAGCATCCCATAAAAGTTATCTACTATTGTTATTATrCTTGTAATCCrr 

AGGGACTCTGCTTTGATACTACrrCCTTACATGAAATCTrTCTrCTC 

CCTCTGGCCCATTCTCTCrAGCACCATCATATrGtATTATAGCTGTCTGTTTTC 

TGGTTGGTCCCTCCCCTA^^TGCTTCTTGAGGATAAGGATTATGTTtAGCTAC 

ATTrrAAGTAACTAAGATAGTCTCTGGCATATGATAGGTACTCACTAAATATT 

TGTTGAGGGAGTAAAGGAGAAAGTGATAAAGGTGAGAAGGAAAGCCACAAA 

GTGGATGAGTATCTTGTCATGGGGAGATCTGTGAAAGGACGGTTTGATAAAT 

TGTCCCTTCATGGATTGTAGTCACCATCTTCCAGGTCCCAGAGGGTCACTTGG. 

.TACTGAAGTCTTATCAGCCTAGGGACCAAGGCTGCCTGCCCTTCATGCCCTGT 

GTGAGACACCGCTTCTGAACTGCCTCACGTCCTCTCAACCTACCTTTTCTCCA 

GTCGCTGCTGGAGCACCCAGCACCCAGAGAAGTGAATCTACTTTCCCTCACCT. 

tgtaacrgagtctttcaactitgaatgcattitaaaactitrtrcctctc^ 

tagtcttaagatgtaacctcgaaactgagtgtagaaactctcttttccttagt 

cttaaaatataccttgaaatgtactttgaaactctgcntccctctctttcccac 

cagactctcccttgeaccatggacacgtatctaactgtatacrrgttaataaa 

ttccaggggctgattttacacaacagccaggcaaggagacccagctgcagaa 

trctcccctacttggagattacctcatgacagacatctataacctgactgcaa 

ctgagatggcaccagccagcactccaagtggacaacaactcaaaataaccct 

tggaagaagacatgcaggcctgtatcttgtgttactctggcatggttctcata 

. ctaagtatcccctttttatttmatttatttttaacatat^^ • 
gtctctctctgttgcreaggctggagtgcagtggtgtgaacgtggctcactgc 

. agccrcaactcccaggctcaagcagtcctcccaactcttagcctccagagta 
gctgggactacaggtgtgcaccaccacatcrggctaattttttitgtatctttt 
gtagagatqggtmcactatgttgcccaggcragtttcaaactcctcggttc 
AAGCAATCGGACAGCCTCGGCCTCACAMGTGCTGGGATTACAGGCATGAGC
•. CACTGCGCCTGGCCAAGTCTCCCCCTTTA^^TCCCTrCCTTCAGTCTAACAC 

TrGAAATGGTCTmGGAGGCACAAGCCTGGCCATTTCCCAATTGCTAGCATT- 
.TGAATAAAGTTGCTTTCCTTTTACTCGCTTCTCATATmGGCTCTCAAGTCAT • 
GAGCAGCCAGACTTGCATTCAGTCACAACGTGGCTGCCCTGCATCTCATTCTG' • 

•ccctggggttgctcttggtcaacttccitggacattgttagaaatgtat^ 
actgtcacatatgcaagtatgcaagtgcaagggagttggcatctcatggggc • 

• aatgtaaccactgcaggacatgaactagttgataaatattcgtctcttctgac 

• ■ tcttgagcagatgattctgaaagggtgatgaactcatcctggctttccacgga 

. TCTGAGTTTTCTCTGAGtTAGCACTGGAAATCCCATGTTGCTGTTGGTTACOT 

■ aattctgaaatgcattatgcaggactccttagggggtcccagtgagatcaag 
trccagtttcccacagcttaataacacagattgttttgccrm 
tgggattactrcccaatatatactgcctrcacacaagcttgtgtcttcggctc • 
tgttttctatggggaatctagctggtcttrgaattgggcaatgcrtagcttat 
ctgtgcagtgatgagtgttrccaa(mtaggaaacttaaaaagacagaggca 
agtgaaataagtcagggccccaaacagccaaatt'atcactttcctgacgttg 
ttacaaaattagtctggaaggtatggcactgitcaaagacagtgtaatgaaa 
tatitattataaggtttaattgcttgaacagccagatatagaaaggactgca 
gacagatagtcaggagaotaacttctggtcccacrrttagtatttaaaatcc 
tatgtgacctttgacacgcttctccttgggcctcagtttcttcat.ctccaaatt 
cttggctgctactctgctaagatcaagtgtacaaaattaggaggttaggcta 

GATTTn^GCTTTTACCAGTTGTrCTATTGCAGCAGGTGCAGTTATACATGATG^ 
GCAAAGGOTGTGGGGTGTCAGGGGCCTTGTCCCTTCCTAGCGCCACTAGGG 
TTAATAGGCCTGGTGGCCTGTTTCATGTCCATTTCACTAGTTGAGCACATATT 

. GAAGAAAGATTTTATATATGGGTCATAGTGGACCATGAAGAACTCCCAACTT 

, ATTTCCCTGACAACATTCirrTriTrCCnTrGAGACAGGOT 
AGGCTGTAGTGCAGTGGCATGATTTCGGCTCACTGCAACCTCCGCCTCCCTGG 
TTCAAACAGTTCTCCTGCTTCAGCCTCCCAAGTAGCTGGGACTACAGGCATGT 
GCGACCATGCCTGGCTATrril'nTITTGTATTTTAGTAGAGATGGGGTTTCAC 
CATGTTGACCAGGCTGGTCTCGAACTCCTGACCTCAAGCGATCCACCCACCTC 
. CGCCTCCCAGAGTGCTGAGATTACAGGCATGAACCACTGCACCCAGACCCCT 
GACAAGATTCrCGAGATrTAACAATGTCAAATCTTCTTTCCGCAGTGCCAACT ' 
AGACAAGAtATCAGGGCCAAAAA^TATTTGGCTCTGAGATGTAAGAAAGGTCT 

. TTCTAATTACTGTAGAACTTGAGGTCATGATCATGACAGCCAGTGGTGTGCTG 
CTCCTACCTCJCTTGTAAAAGCCAGTTGCTAAATATTCAGGAATGTTTCAAGC 
AAACAACCACTGGTAGCTTGAAATCAGTCATGGGGAGAGTATTGACACCACA 
. GAAGTTGACAGATGCTACAAATCCATGTTCTCTCCACATCCACCCCfCCCTGA 
GCCAGTTTACCAGCAATTCCCCGCATCCAGCTCTrAACACTGGAAGTTTTTAT. 
CCAGGACrATGGCTAAATTAAATATGTGTGTCTTAAAAAGAA'rilTrrrCCCT 

■GCTTITTGGTITGTGGATACAAAGAGCTGAATCATTAAAACTTGTTTTTCTCT 

• CACCTCAACAGTGGCTTTCTTTATAGATTCTATGAATCTGAAAGATACAAGTT 
CCACCAGAGGTCTTGGTAGGGTTCCACAGCCCCAAGCTCTGCAGGTTCTGAC 
AAAGGATAGGCTGGGCCCTGCCAGGAAAACTCACAGAACGACATCAAGGTG 
TTTGTGCTGCTGCTTTAAGAGTTTGTCCTCAAGACAAATAAGAGTTtGCCTAT 
TTGGGGGCTTTGGAAATTTTITITrAGGATTTTGGAAGCCrGTTCTGI^ 
CrAGTTGTGTTACTGGACAAGTTAGTTAACCTTGCTGATCCTGGGAAGTAATC 
TCAGGAAAGCAGGdAAAGGCAAAACAGCGTGTGTTATTAAGCTGTGGGAA^
.CTGGGACTTGATGCCCCTGGGACTCTCTAAGCAGTCCTGCAAATTGCAm 
GAATTACkjATAACCAACAGCAACACGGGATTTCCATGCTAAACCAGGGAAA 
ACTCAGGTCCCTGGAAAGCCAGGATGAGTTCGTCACeCATTCAGAATCATCT 
GCGTAAGAGTCAGAAGAGGGGAACATTTATCCACTAGTTCATGCCCTGTAGT 
GGrrAAGTTGCCCCATGAGA;TGCCAACTCCCTrGCACTTGCATACTTGTATAT • • 
GTTCAAGTATGGTGCACATGCTTGAGACAGGATCTCGCTCTGTTGCCCAGGCT 
GGAGTGCTCTGGCTTATCACGGCTCACTGCAGCCTCAACCACCCGGGCTCAA 
GTGATCCTTCCACCTCAGeCTCCTGAGTAGCTGGGAGCACAGGCATGCATCAC 
CATGCCTGGCTAACrTAAAAAATTlTllTGTAGAGATGGAGTCtCACTATGTT . 
GCCCAGCCTGGCCrCAAACTCCTGAACTCAAAtGATTCTCCTACGTTGGCCTC 
CCAAAGTGCTGGGATTGCAGGAATAAGCCACTGXGCCCAGCCAACCTTGCTG' 
.; AGCTTCACTTATATGTGGAATGGATGAGACCACTTCAGAGGGITTATGGGTTT 
GTCATGAAGCTCAGGGAGCTTAAGGCTCAAGACACCTCACTTGTACAAGCTT 
CTCACAGGAGGGCTTTAGGTACACTGTATTTGTAATTTTCTAGTCTAAAGACC 
CCrGCCCTCTTCCCAAAATGTGTCATCTTCAGAGCTGCACAAACCTTGGATCT 
GATCCTGCrTGTGAGGATGAATCACGTGTAAAACACTTATTCTTGAGTCTGGC 
CCATTGTAAAAGCCTCAGAAATAGTCCnTGCTGTrGTCAGTGGAAGTACTTrC 
CCTCCATTATCACATATCCACATATTCriTCTCCCACTGCCTAAATGTCAATGC 
CCATCCACGTCTGCTCAGAGTTTGCTTCTTTTCCTATAATGTCAGTTTCTTGCC 
AATCACTCCCCACTTGCCATTTTGCCAGTCTTATTCCATAAGGTGCATCTGCA 
CTTCCTCACTGCGTCCTTTTTCATAATGGGAAAACAATGTCCAGCGATTATAT 
TCTTCACAGAAGTGATGTGACAGCAAAGAGAGATCATAAATGAATCGCCTGA 
AGCTTGTTGGTGTGAGGGGGAAACCAAAGTCATATTAATACTTAACAAAACA 
AGCAACAGGACAGGAAAACAAAAGGTAATTAAGGCAAAGCTGTGATGTTTT 
GCCAGTTGTTAACATAAGAGGCCAATTGTCAGCTGACTGTGATGTAAAGACG 
CTTCCTTTAAGAGCATGTGTAATATGTATCTCAGAATCATCAAAGGGCTCTAA. 
GTCACCCTAATAATGGGTCTGTACCACAAACAGAGAGAATGCAAACCACATT. 
TTGTCTTAAAAGACACAGCAAATTGCACTGCAGCTGTAACAAGAATCTCAGA 
GTCATTTGCATTAACTGGGGATGAGTGAAGGGGCTAAGTGGAGTGTCTGTGT 
GCAACATGGCACrrCTTCCTTGACCTGTGAGAAAGGAACTTGACAGCGAGGC 

tcagtggctcatgcctgtaatcccagcactctgggaggctgaggcaggtgga 
. tcacgaggtcaagagttcaagatcagcctggccaacacagtgaaaccccgtt. 
tctacraaaagtaaaagaaaaaaaaattagccgggcatggtggcggacgcc 
tgtagttccagctacttgggaggctgaggcaggagaatgggatgaacccggg 
aggcagaggttgcagcgagctgagatcgtgccactgcactccXgcctgggag 
acacagcgaaactctgtctcaaaaaacaaaaaaggaaagaaaagaaaggaa 
cttgacttatatacacttaggtgcagccatcattgagggcttgttgtgcaagg 
tgctatggatggtgatgagtaaaaccaggtgtcctttctattatgctgtgctg 
gaagcttctttggcaagtgaagggagtgtggcctttgggatcagatgaatct 
ggtggaatcctggctctgtgctcagcatatgatttagacatttatgtaacctt 
cttgagcctcaagtttcctcatctgtaaaatggtaacaatactacccatctta 
cagaatcacagagaggattaaatgggaaaaaaaagacaaagtgtetgaaat 
atagcaagtrcrtaataaatattaactttcttacccccttctggaggcataga 
atcttagtgcaatcttggtactctcagaaactgtttatgtagctcatctgaat 
cttatttttttagtagttgaaacattttgtcltaatacaga^ 
TGTAGCACMGAATTAGAATATITATTTATATATATAiTrATTTATT^
'.CCAAGTCTCACTCTGTTGCACAGGCTGGAGTTCAGTGGTGCAATCTCAGCTCA 
CTGCAACCTCCACCTCCTGGGTTCAAGGGATTCTCCTGCCTCAGCCTCCTGAA • 
TAGCTGAGATTACAGGCGTGCACCACCATGCCCGGkjrAATTTTGTATATTTTT 
AATAGAGCCAGQGTTTCACCATGTTGGCCAGGCTGGTCITGAACTCCTGACCT 
CAAGTGATCCACCCGCCTCAGCCTGCCAAAGTGCTGGGATtACAGGGGTGAG ' 
CCACCGCGCCCGGCAAGAATTAGAATATlTAGATAAGAAAAAAATTCAAAAT 
TACAAATAATrCrAATACGTTAACACCCTTGGGTGTATCCTXeCAGGACACTT. 
TTATGCCrGTTAATACATACGTATTATTAAAAATATAATTATACAATACATAT - • 
TATTTTATAACCTGCTTTGTTTTCATACGTTACGTCATTATGGGAAATCTTTGA 
AAAAATCTCTCCCATGAAAAGGCCAGCTAACAATTATGGGAAGTATGGGAAG 
TGGirrCGATTAACATACTGTGACAACTATCTATtATTAAGAAAGTCACGACA 
AAATTTTGGTGCGCCTTTCCCAGATGAAGGCCACATAGCTGCTATAAGGGAC 
AAGGACCAGTAATATCTTCAAATCCTCTTTTTGAAGTGCCITCAm 
. GATGCAlTrTCTTTrAAGAGTTTTATAAAATCTAGGAGGAAA 
GTGTTTAGAACAAGGCAAAAGAAATTTCTGArrGGATATTGTTATGGGCCCA ' 
TGACTTCTGTTGCAGAGAAGGTAATAGAAAAAGGTAAAATACTTCTACGCTC ' 
TATAATTCACCTTGCfGGAAAAAAAACAACTGGATTGGCTTGACAGGGGCTT 
AGACGGGTGACCAGGTTACTGTTTGGTTGGTTGAGAGACGGAAGCAGTACAG 
.AATGACAAAAGTGTGTGGTGGGCCACCGGCCACTGGTTCATCATAGCAGGAC 
CrCAAACCAATGTCTAGTCCATGAATGTTTATATATGGGTTGGTATATGAAGG 
TGGATATTTGCAAACAAATGCTTAGTTTTAGTGTCAGGATTTTCTTCCT 
AATGAAAAGAGACTGTATGTTTTCAAGTTCTGTAGGCCTAACTAGAAAGAAA 
AGAGTTCAGGATTTCAGATTGTGCTACTTTCACAAGATGTAGGTATATCTTTA • 
CCAAAACACACATGGCTATGCACATGTTCAAGTATCTTGTTATAAGAAGGGT 
GTGGTGTAAGTGGAAAAATTGCTrCTGTTATrCTTGTGAGGCAGTGTAAC^ 
GTGGTtAGAAGCACTGATTGAAGAGCAAGACTACCTAGTCTTGAATTCAGCT 
TCACCAACTGTTAGCTGGGCAATCTTGGGCAAGTTAGTTACTCTTTCTGAGTC 
TCTATTTTreTTGACCTGTAAGAAAGGAAATTGATGGCCAGGAGTGGTGGCTC 

atgcctgtaatcccagcactttgggaggtttaggcaggcggAtcacaagttc. 

. AAGATCAGCCTGGCCAACACAGTGAAACCCCATCTCTACTGAAAGTAAAAGT 

• aaaaaacaAaaaaaaacaaAaaaagaaaaaacaaattagctgggcatgatg 
gtgggtgctgggaggctgaggcaggagaatggcgtgaacccgggaggcaga 

GGTTGCAGTGAGCCGAGATCGTGCCACTGTCTTCTAGCTTTGAGACAGCGAA 
ACTCTGTCrCAAAAAAAAAAAAAAAGAAAAAGAAAAGAAAGGAACATGACT 
TATGTATATTTAGGTGqAGCCATGATTGAGGGCTTGTTGTGCAAGCTGCTATG 
GATGGTCATGAGTAAAACTAGGTCTCCTTACTATTATGCTGTGCTCKjAAGCtT 
. CCTTATGGGTATAATAATAGTTGCTCCTATtAATGAGTATATCCACCTCATAG . 
GGTrGTTGAAGGATTrGAAAAATATTTTAGAAAGTAACAACnT^ 
GAACAGTGCCTAGTATATAATAGGAAGTATGTGTTAGCTATTGCCATTTTATT- 
AGAGTTTTAACAGGTCAGTCCAACAGAACrGGCAATTTCGTGAGtGATACTTT 
TirnrmCCTGGGACAGAGTGTTGCTCTGTTGCCCAGGCTGGAGTGTAGTGG. 
TGTGATCTTGGCTAACTGCAACTTCTTCCTC(n'GGGTTCAAGCCAT^ 
CTCAGCCTGCAGGGTAGCTGGAATTATGAGAACGCACCAACACGCCCAGCTA 
ATTAtTGTATTTTAGTAGAGACGGGGGTTTCACTGTGTTGTCCAGGCTGGTCT 
CCAACTCCTGACCTCAAGTGATCtGCCCACCTCAGCCTCCCACAGTGCTGGGA 
TTACAGGCGTGAGCTACrmAATTCATTCATTACn'CAAACAACACGGT
TGGATGTCTATATATGGTTAGGCACTATGCTAGGCTCTGGGCAACTGAAAAA- 
AAAATGCCAATATTCTATATCTTTGAAAAGTAAAAAeTGACTACTCGTGTCTT 
•CTCAAGAAGCTTTGTGACTGAGGTTATGCAGGTCTTTTTATCT^^ 

TCCTGGAAGGAGGACAAGACTAGCTCACCTGCAAAGAACTCAGTCACITATC 

AGGAATGAATGAAGTGCAGAGCGTATTGGGATTCCCTAACACCAACTCTTCr 

TGAACATdCACCrTTGTeAAACCTGCCACTGTCAGAGCTGCCGACACAGGCA 

ATGGATGGGGACATCAGGACAGGGCTTGGGGTGGGAGCCTGCAGTTCCTGGA 

ATTTGCCCTGCTGACCTCACACTAGGTGATTTrATCCTACTrCCCAGAAAGm 

CCCCCAGTTAGTCTAAAGTTGGGATGAAGCTACAAATTATTTGCTCTGAAAAT ■ 

TCTGAGTCrATTCTAGAGTTAAtrrCCTGTACATGTTACATGGTTTGCATrATT 

AGAAGGGCCAAGGGGCCCCGGCAGTGCATTCCTmCATTTTCTGTAAAAGG " 

GATCrGTGGGACT(nTrCATC.TTCCATTAATGATGAGAAtGTGGAAAGGGGA 

GGGTGGGGTAAGAGAACTAACAATTATTGAGCACTTACTCTGTGCCAGGTAC 

TTTGTACATGTrCCTGGCACAGAGTCAGTGCTCAATAATTATCGGTTTtTGGG 

GGAACTGAAACGAAAATCTGAGAGGCCAGGGGCCATGimrGTCAATACGT 

CTATTTTGGCAATGAGCACATTACCATGTATATTGTCAGTGGTCCATGTTGTT 

GAATGATATGATTATACACATGTTATGTGTGCATATCCACCACCTATCTATGT 

ATCATCACTTGTCrATCACTTATTGAATCITAGCTAAAGCTCATGCTATTTAA 

AATTACAATTGCCTTGGTCTCAGTTGAATACCATCCCATAATTTCTATGTGAA 

ACAATTGATTTTGTGATTTCTATTTCATAATTGCAGAATTAACTAATTTAT^^ 

TCAAGTCCTTTGAACTTAAAGATGTrTTCAGTAGTGTGGATrGAATATATCTA 
TGGGTTTGATGAAATTAACTTTTTTTTT]T^ 

TTTCCCAGGCTGGAGTGCAGTGGCGCAATCTfGGCTCACTGCAACCTCTGCCT 

CCTGGGTTCGAGCAATTCrCCTGCCTCAGCCTCCTAAGTAGCTGGGATTACAG 

GCGCCCGCCACCATGCTCGGCrAATTrmGTATTTTTAGTAGAGACGGGATT 

TCATCATGTTGGCCAGGCTGGTCTCGAAGTCCCGACeTCAGGTGATCCCCCTA 

•CCTCGGCCTCCCAAAGTGCAGGGATTACAGGCATAAGCCACTGTGCTCGGCC 

GAAAGTAACTTTTAATTGTGATAATTACATTGCCCITATTTCCATTACGCATG 

AAACAGAACTCCTCTGTCITCTTATGGTAAAATTTTGGCACTGAAGACCAGTT 

GAAGATGAAGGA.TGqATCTTGGTAAGTTGAAAGAGGAGAGAGAGTGGGGCC 

AGGGAAGCCACAGTGGGCAACAGATTGGCATCCCAGCTTTCACCCCTCTCTA 

TCATTCTTAGGTATTTCAGGTCAGAATTAAGAGGAACTAATTGGAGATATCAT 

.CTTCTTGGAATTGTTTGTGCCTCATAACACTrTAAAAATGCATTGACGCAGTG 

ATTTCTGAAGTTCAGTCATrTGTGTATT<}CCTTCACGAATTITGCTATACCm 

GAATCACCTGTACTCATACTGACTTGAGTTTTTAATCATCTCACTTTCTTAAA^ 

TTAAATACTTITCTTTTTTGGGAAGAGGATCTCATATTATrACTT^ 

TCrCTTTTtCCACACGTTCTAGTTATATCATTACTATAAAATTGAAGACAGGC 

ITCACTrGTTAAAAACTGAATGTAACTATAAAATTCAATCCAAGAAAAACAA 

AGAAAATGTTAGAGAAAATirrAGCTAAGCCTTATTTTTCnTGGACTrGAAGG 

CTAAGTTITGCTCCCTGtCAGAGGGTTGCTAAAAGACCAGCAGCAAACTGAG. 

AGTTTACCnTGGAGGTAATCAGAAAGAAAAAATAATTTTAAGGGGAATAACr 

TTCTCCCCCATATAATTCGGTGTATTTTAATGCCTGGTTGAGCeACTGTAATA 

CTATGGAAGATTATGTCATCTTCCGTAATACACATTATCTCACACATGGCAAA 

ACACTTCATGAGAAGAGAAACTGACATCTCAtGATTTTGACCTTCACCACATA 

TACAAGTTTITrTGTrAGAAATATCACTCACGAA>U^ATGGATGAAGTAGTOT 
CCATGCAGAAGTTTGGACATTTCAATTTAATGTCrCTGGTAAAGATmCCAG
TTAAAAGCACATTGACCACAATGTTTCGTCTTCTCTTTAACACTGAAAAGGGG • 
AGGAACGGCCTTATTAGGAGTCAATACTATCAAAGTCAATGGAGGCAAGGGA 

■ CCAATGGCCCATCATAGTCCTAATATCACTTACTGtTTTCGAAGGAGGACATC- 
TTTCATAGTGGTATCTACCATCTCTTGGGGAGAAGAGCAGGAATGGATAGAT • 
TAACCrCTTCCAGACATCTGCTCACAGtCCCGATGCGGTTTCTCACrTAGAGG • 
GTTTTTCTTAGAGATCTTCrCTGCTGTCCTCTGCAGCTGTCAGGGCATTCTCCA • 
GATGGGGCCTGGTAGGAGTCeTTGAATTGACTCAGGTCCCACATGTCCCTGCA. 
GTTCAmATGCTTCAGGTCAAGGGTCACAAATATTCACTTGATAAGGGATGA . 
CAGAmGTCACAGCTAGCTGCTGAAAGTGGAGTTGCATGAAGTGCACATTT 

■ AGCTTGCATGAGGGAGAGCACAAATTGGAAACTTATCAAAAITGCCTTGGTG 
GGTCTCTTrCAAGGCTrCTrTGGAGGCTGCTCTCAGATCTATXACCTCTGGA^ 
ATTCTTGAGGACnTrTAAAAATACAATGACATCTCATCTCTTCTrAAATTCTGT . 
CrAGTGGACAATTTGTGAGGGGTGTATGTGGGAATCTGCAATTGAATTCTTCC , • 

' TCAGGTACGGTGAGTTGGGCCTGATTCCAGCTCCACCGCGATCATCATCAGAA 

■ TCCTAGTGGGAATTTCCCTTCCTCAGAAGACTGAAACTCCACCTGTGCTGGTA 

■ ATGCCAGGCCCCCTACATAGCTCCAGTCTATTACCTCAAGGGAACACAAAGG ; 

' CACACTCTTACAATATATATTTACACCTGAGCTTGTAAATATGACAGCTAAGA . .. 
ATCAGAAATGTTGTTTTACTTAAATGGGCTrGtCAGTAGATTGGGAGTATATT 
GTATTCCAAGGAAATGCTTTTTAGATGTAATmAGGTACAAACTCTGGATTT . . 
' TCCTTCACGTACAAAAATTAAAACTTTGATTACACACTAGCAGAAATGACAG 
CAGTAAGAGTTTTTTTATTTrAATTGTATTTATTAAACAATTAGGAAACACTG 
TATACTTreTAATTGTrcrAAGTTGCTTTATAATTCTTAATTCATTTCAACTCT 
ATTACAACCTTATGATCACCTACtATGTGACTGTAAAGCACTGTACTTTATTA 
CATTAGCTCTAATTCTTACAATCATCCTTGAAGGTAGTTAGTACTCCCATTGTr • 
ATGAATGACCACrCTTCAGAAGCAGGAGGGACCCTCATCeAAATTTGATTTG 
GATGTCAAAACrGATGATGCCACACATCCACCTCACATAGGTATGAAAACTA 
• • TTATTCACAtAATGAGGCTTTCTGGGGATAGCAGTGTGGTTTCCAAGCAGATA 
AAAAAAAATGGCTTGAGAGAGAGCACAGAAAGGAGACTGGCTTAGGGTTTG- 
trrfGTtGGTGGTTGAGGTGGGGCCAGGGTGAGGGTTCCCACACATGGTTTGA 
AGTTGCCATGCAGCTTCCTAGTAGCAAAGGAGGGAATATCCTGGTTTTCTTAT 
CAGTTCTCCTGGATATGGGCAGATGAGGAAGAGGGAGGGGTAAGGCTTAAA 
AGCTGGCAGTAGTCAAAAGTCAAAAAAtGGAGTCAGATTCClTATTATACCC 
■ATrtTACCAATGAAGAAACTGCGATTCTAGGATAtCATGCATrGTGTCAGGCA 
TGACTTTAGTGCAGTAATTCTCTGTGAAACTTAGCCCCACACTGCACTGTACT 
GrGGTCTTGGAGAAGTAGAAGCAATAGGAAATACACACACACACACACACA . 
• CACACACACACACACACACACACACACACAGTGATTTACTATAAGGAATTAG 
CrGACATGATTAAGGAGCCTGAGAAGTCCAAGATeTGCATTCAGCAAGACAG 
AGACCCAGGGQAGTTAATGGTATAAGTTCCAGTCAGAGTCTAAAGGCAGGAG 
AAAATrGATGXTGTCTCAGCTTQAAGACAGTCAGGCAAAGAGAAATAATTCT 
TTCTrACTCAATCTTTTAATCTATTCGGGCCTTCAATGAATTGGATGAGGCCC 
ACCCACATrGGGGATGGCCATCTGCTTTACTCAGTCTACCTATrCAAGTOT^ 
ATCrCATCCAGAAACACCCrCACAGACACACrCAGCAGTAATATTCAGCC^^ 
• ATAfCTGGCTATTTCATGACCCATACAAGTTGATATATGCAGATAACCATCAC 
AGTATCCACCirGAAAATGCACATCCCTTATTATTGAAGGGGAGTGGGAGGC 
AGAGGAAATCCTAAACACCATrGGAAATCTATATATTCTAGAGAGACTATGA 
aagcaatgtack:trggcatgetggaaggagcatggtctrtgggatcagaa
toggttataatttggctttgccatttattggctgtgttatcitgaaaattgct . 
tagccttactgagttttagtgatacagaacatctccagtgacatgcaaattta . ' 
ta^^catcgcattctattitgggggtctccatrggaaagctctltotaaat. 
•aatggcatttccctatattaggtttggggtgcatacatagtctctactgg^^^ 
•atgaaggcaagttaccaggaattccaatgtataaggacacagctggccatct 
"ggaacaaatattggaagggattttggagaacagaggttagatccaggggca • ■ 
ataaggtccagactgttcacaagagatgagatgaaggccatgctctgcctat 
•ttaggaatccacaggacactgagAgttaccccaagacaagagaaagttcaa 

■ atacccaagacattgattgactactggcaatagtttttggcacaaccctggg 
amgctgcccaaatgttgtttcagtcttgtcccaggagagctcagttctcag 
gctajctccacagcctaccatgtragcaagcccaaaagtgaatatgtcntrtg 
•taatttttccagcaaaaattcgagggctgactctcattgatccaaatlt^^^^^ 

ACAGGCCCATCCATGAATCAATTGCTGTGACCAATTTACTAGGCCTGAGTCAC 
TGGTCTGTCCACCCCAGGTGTCTTGTGGTCAGCtCACCTGAACAGCACGGACT ' 
"GAGACCCAAGAAAAACTGATGCACTGTTACTAAAAAAGTGAGGAAGTGAGG 

ctgagtaggcaaaacagcagatgtccattacaggaaagatctccaaaatgta 

acctcactcattttttctgtatgtgtaactctgtgtgaacttagcgcctgcccc 

aagggtggcatttAcitagactgtgatgaaaatagagaccctggatttgtgc 

agtgctctgtcttcccccctctctttctgctaccacgatttctccaacttctgg • 

trgtatcaggtttcagctaagggagggaagacatgacttaacgcataggctt 

ccatagagattccatactgggacttcaaataggtcacactactggtgagtcc 

cagcacacagatcctagtgccaagatactcatgctgttgattcagaacttcca 

gacatacatgtgcttcctctgacagggaaggctgctactacctatatcttctg 

aattggttcatattaatcataggtcatgtgtaccactggtaacaatagtcatc 

gmagtgaatgtttacaagtgagaacactgcctattcataagcttgaaatta 

tctgtgaatttgggaatgtgtgccagccgtaagctgaccagacatacttactg. 

gtcattaatcaacagggttttgtitccttatccctatgttrgactgagacaaa 

ttcctctccctacatcactcaaatgtggatacagaagtcttctccccttcctat 

TTATAACCTCAAAAGGTTGCAATTTGGACTGGAGGTAAGGAGGAGATAAAGT 

acttacaaactacttgtggaattccgcgagtcccctccaccccctgcltrrtcc 

ctgtgtcttgaccaaaaatcacagagtaccttgatcacactatgacacaacc 

agcagcaggcttitcccackjaggcttgacaccagggctttgaacattcccag 

gccgtcatacaggtateaaggtrracgaggaagaaactggtcctagccttag 

cccaaatccrtaaacctttatataaactccatgccctgacctcctcacagcag 

acataactagatagaacaccgttgtctcttgctgtttgttgcaaagattgcta 

.cagccrrctctgtgcctaagtttttctaatcaatgctttggatggatgaaaaa 

GAAAAAAAAGAATTTATAACTAAAAGGAAAATATTGTGTACTATATATTATA 
TATAGCATATATAATATATAATCTGTATAAAATACATGTAACATATAATCCAT 

■ TGTATGTTATATGTAATTTTTATGGAAATACAACAAATTATAAGTATAATAAT 
TATGTATGTTACATATATATATATTTTTTCACGTTTTTAACTTGAGGTTTAAGT 
ACCTGTGATCrTITITnilTi'il'iiiri'i'iTGAGACAGAATGTAGCTCTGTCAT 
CCAGGCTGCAGGGCAGTGGCTTGATCTCGGCTCACTGCAAGCTCCACCCCCT 
GGiaTTCACGCCATTCTCCTGCCTCAGCCTCCAGAATAGCTGGGACTCTAGGCG 
CCCGCCACCACGCCCGGCTAATTTmGTGTTTTTAGTAGAGATGGGGTTTCA 
CCATGTTAGCCAGGATGGTCTCGATCTGCTGACCTTGTGATCCGCCTGCCTCG 
GCCTCCCATAGTGCTGGGATTACAGGCATGAGCCACCGCGCCCAGCCTATCT

• GTGATCITAACCAATTTATOTGCCTTTTC^^ 
GGGAATTCAAATAGATCACTAATATAAAGACAATTTAAAGTGCTTGACTTGC. 

• CAGTGTTCITATAGCAAAACTTTCATAAAGTTTAtGACACTTCnTrGAm . 
AGGTITrAGGGGTACAGGCATATCITATrATAGAGTATTOTCCT(^^ 

■ ■ AAAGATTTAATTTACTACGCATCATITGCAAGCrCCTTGACaTITCTGGATCC 
CGACCACCACCTGCCCTAATGGATACTGACACCAGCTGTGAATATGTATAAT • 

• ATAAACTGGCCAAAACACATGGCCTGGGCAGGTTAATAAOTGCTGCCCOT 

caaactgatattatttaaatitgtattraacaotgaacactgtcataggct 
■• atcctaaaatgaaccccaataatccatgcctcttggtagttatgccctgtgta . 
• atctcctccccatgagtgtgggctgacctagcaacttgcitttaactActata • 

• aaatagcaacagtgatgggctgtcatitcrgtgatggtgttacataagagttt 
tacrrctgttttattaacagaccctctcaatgccttctcagcttgcatacm 

.• atgaaacaagcagctgggttaagagatgtccatgtcgcaaggaaatgaggg ■. 
. cagcctccatcaacagccagcaaacaactgaggctctcagtctgacagccca • 
tgagtaactgaatcctggcaacaagcatgcaagcitggaagcagatccrtcc 

• ccagtcaagcttttgaatgcaacctcagcccctgctgacacttaatggtgcct 

TGTGAGGGCCCTGTAGGCAGTGGAGCAAGCTAAGCTGTACCTGGATTCATGA 

cccacacaaactgggaggcaacaaatgtgtgttgimacaccacaaaattt . 

■ gtggtaatttgttacacagtgacagattaatacaagtactgagtggggaagg 
ttgcatataccattcagccaaagctctcttgttaactggaaccaccctaatta ' 
caagataatttaatgaatgactgtgttacctgaacacacccttcagagaccc 
catcacntgcatgagtcagagcrctatgggcrrcaagtgacagaaatccattc 
taattagcttaagtaaaaaagggatttgttgatttacatacctgagaggcca 
agggtgtaccaagcaccctcaatttcagatgtactaaaaaggactctgactg 
gccctttttagatacatgccaatcattaaagatttaatgtatctataataaga 

• tactatataataagatattaggaccaatcattgtgtctagttggatagagttc 
tctgagtggttggcttgggtttcatgcctacccttgcggtggagaaggggag ■ 

• • aagaacattaeatttgaaagccttatcatgtaatatacagatagttacccaa 

AGAAAllll''ni'ri''rri''ril'GCTCTAAAAAGGCGAGAATGTACACAGGGCAGG 
CAGGCAAAACAACAGCTGTCTACtATACTGCCTACTACCCAGCTGGGAGGCA 
AAGTGACTCCATCTTGGATGCTAACCTGCCATGTTGACTTCTGATTAGCCACA 

ATCCrGTGAATATCTCCTGATrCCTA(JmATTTACrGTTTGTOT 
TGTCAACCTTGATGTTATCACACACATTTTTGCCTGTTTGGGAGGGTCGCCrrT 

■ ■ AATTGTCTTGCTGGAGCATGTATACCATTTTCCTGTCATATTCATATATAAGC 

CTTGGGTCAGCAGAGTAACAGTGCAAAGATTTACCTGTCrTGTGGCTGCCTAA 
" GACCACACtTCTATCrGTAAGTrCCCCCAATAAAACACrCTTTGCCAACAAAC 

■ TGGATTTGTCTGTCTTGTTCTTTGGTTTCTCAGCTCCTTTGGCACTTGAGGGCC 
AATTTGTATATATGGCCCTTTCACAGAACATCAGCATCTCATGAAAATATTGC 

• TTCGCATCACATACAAACTTCTCnTCCAAAGACATTCTGGTAAATGTGGAATA 

TTGGGTGtCxiTAGAATTCGAGTGATTTTAGAGATTTTTTATTGAATTAT^^ 
. . TATTGACATATGGTAAGAGTACAAATAGATGAATTTTCAGGAAGTGAGTTCAT ■ 
CTTAAAGGACCATGCAGATGAAGGCACATTACCGTTATGGGATCTTTGGGAT 
■ GTGGCTITTGTGGCTGGAAACCTCTGTGGGGGTTGGTGCGTTTGCCTGAGTTCT 
TGTCCTGCATGCAGGAAGAATGAAGTATGGAGACAAGTGGAGGGTGAAGAA 
GATAAAGAGGAGGTTTATTGAGTGTGAGAATAGGTCAGAGGAGAGCTGCAGT 
dGGTAGCACCTCn:CTGtAGGCAGGCTbTCCCAC:CGAGTAAf CGGCT

'AGAAAGGAGGCGCTGGAGAGGGTGGCCTCTCTCTGCCAGCTGGTCATCCTGT 
CAeCTCTGCAGCTCITAGCAGAGAGGGTA.GCTCCTCGGTGCATCTGGTCACCT " • 
CATATCCCGCTATCAGCAGAGACAGTAGCTCATCtCTACAGAATGGTCATGC 
CATCATCTCTCTATTCTCTGCCCTGCTCTGGCTGAGCCTGijGGTTTTTATGGAC • • 
CTCAGAGGGGAGGAAGTGCACAGCAACTGGTCCATGGGCAAC'CATGGATGG 
GCCAGAAAAGGCACCACAGGTCCCCACTCTGGTACGTGGGTCTGGAAGCCCT 
GCeCTGAGCCTlTGGGACCrCCCTGGTCTGAAGGT(jGAGCCITATGC^ 
CCATCrCCTTCTGCCCAGGAATCTGTCTGCCTCCTGCTGCTGTTCATGGCTGCC 
tAGACTCAGCCCCAACGTTGTTCCAAQATTGGAGCCAGCACeAACAGCAGGG 
AAAAACTAGGCAGCAGGACAGOTCTTTGGAGCCTGCAAGGGCAGGGGGCC 
TTCCTA6i3CCGACAATAGTGCAGGGATGCCTGAGGCTGGAGCCrGGCCCAG^ 
■ AGGGTGGGGCTCCCACCtGCTCCGTGGAGTTGGAGGCTTGGGTTCTGCAGCGG 
TGGTTTGGGTGGCTGCAGCGGTACTCCAGGAGCTCCTGCCCTAATTTGCAAGG- 

ggtggggctcrrgcttgtccccggctccrgccgactccatggaacatgcagcc 
ccaggctgcctcsccatctgcagccagtgtgatgacagcagtaagccatctgg 
agtggccactgccatcattactgacaccacagaagccctgctitatgcitctt 
trcrgtrcttatccacatccctctcccccattccaggacaaccacrgtcctgac 
. tnrctaccaacattgattagtgtttcctacttrtatattttatgtgaagggaaca 
ataccacattctctgtttrgtatttggcttttttgcttaacat^^ 
gtttcatccatattgttgtagttctagatttgttcacatttctgcatagaattc- 
catcgtgtgaatatatgacaatttatttatccattctaccgtcratgggcact 
tgggaaatttccacttggggactatratttaaagtgctgctacaaacatgtta 
. gtgcttgtctttctgtgaacacatgtatgcattggcatacacatatgagaatt 
tctgggtcttagggatggcatatgttcagctttagtagatgctgccaaactgt 
titcccaagaggtrgtattaacttacaatgctaccgatagcatatgtggattc 
tggttgctgcacatctttgtcaagacgtggcattttacatcitititat^ 
ccattctggtgggtaagtactaatatcatcccattatggtmaatttgcaat' 
• tccctgatgactgatgaagaggaacacttitraataaacattrragctatttg 
aatatcctcttgtttttttgtaaatctittggtaatrgtt 
-th'cttatcgattrgtagaaattcttrataaattctggtraggagtgactcatc 
taatataattattgcaaatattgtctgcaacntratggtcgctttttcatctr^ 
: atggcitttgatgaacagagttctcaatittaacataatctatacttrrttct^ 
ccititatatitagtgttrratrgggtcctgtgraagaaaatgtggccacctcc 

GAATTTAATTCCAGTGATGTCTTATCATATCTAAGTGATATTTTAAGGACCAC 
ACGTTCAGGCGACATCAATCATAGACAGTAAGACAGGTTGATAGtTCAGTTA 
CCTTTGAAAGAGGCTTAATTmAAATTTCCATATAGAACCCAATAGeCAACA 
AATATGTCTTGGGATGTCACATTTCAGTAAAATATTGCITCCACCnTACrrTATT 
TTTTTGAGTCACAGCTrAACCGTCAAGTTTAGAGGAAAACTCAAAATGTTCTT 
TACrC AA(nTITCCTTTTCTrTCAGAGACAGCTTCGCCrAAAGAAAAAAGC AC 
■ CCATTTAGTACAATGGGCTTGTGTCTGATGCTATTTAATAGCAAATC ACTTTC 
TGTCrCCTAACCATAGTAACCACGTCrGCAAAAGTTGAAGAATAATTCCTGCT 
ATGTCAATATTGCAGTGTTGTTATGAAAATAACAACAATAACAATAGGATGA 
AAAAGTGCCTGCAATGCAGTGATACTTTATGAAGTAGCGCACTGCCAAGAAT 
GrTGGTGAATAAGAGTGACTGTAACACAGAGTCTGAGGCTTATAGCCTATCG 
TGGTGACAGGAAGGTACCATAACCAAAATTTTCAAGGAGAAAGTTAATCTAC 
CAGAGATTAACTCnTrTGATTXTCTCrGGAGAAGTGTTCCrrOT
TrCCAAAGCACAGCCAAAGGCTTTATAAGTTTATATGCAAATAATAAAATCA ' 
CACAACCAAATCTGTAAAAGATTCAGCAGGTGAATGTCAATCTTTAATATGA 
TACTAACATTTATACTGTACACAAACCTATGGCTCTGTTTTGTTAGTTCCTGCT 
CAGAATCTGACTACCnriTCACTGAATATTTTGGA^ ' ' 

AAATCCTGCCTTTTGACCAGGTACAGTGGCTCATGCCTGTTATCTgAeCTACT- ' ' 
.CTGGAGGCTGAGGTGGGAGGATCACTTGAGCCCAGGAGACTGCAGTGAGCTA 
GGATTGCACTACTGCCCTCCAGCCTGGGCAACAGAGTGAGAGTCTGTCTCAA 

AATAAAAATAAAAATAAfAAAATTCTGCCACTGATTAAACCCATTTrCAM^ 

■ATTCnrrAAGCAtTTCTGTGAGAGACAGTTTACAAGACCCATGAGAAAACCT ■ ' 
GtCTGTTTACmTATAGTGTrATTTTTAACCAAAAGTGGCATTATCCT^ 
CATCAGACTtCACTTTGAAAGACTTTAGACTGTGTCTAAAATCACAGCCACCA 
TCCAAGCAAGGTtGGGATCAGTCAAGTTGTTfACATACAAGTGCACAGACAT 
AGGGTTTCTGGAGTTAAAAAAAAAAACCAACCAACCAA'CCAAATCATCCCCC 
TCTTCCACCAAATrCCAGGACACCCAGTTAAATTTGAATTTCAGATAAAGAGT 
GAATAATTTTTCATTATAATTTATGTCCCAT'GCTTATAACAAATrCAAATT^ 
CTGGGCACTTGTATTTTGTCTCK3CAGCCCGACCCAGGGCTGAAAATAATT.CTA 
GAAGAAGAGTTTCTGAAACTGCTTTGTTAATGGCAGCATTATGAGACTAGAA 
•TATAGTCTCTCAAGGCAACAGCCCtCATTTACACATCTAAATTATGAGATCCT 
TTTTtAAGAAAGGGGATGACCTGATCTITCTGTAAGACTTCATTACATrG 
•TCAAGAAAAGGACACATTAGCAGGTAGCAAGCAAAAAAGTATGTGAATTTC 
ATTAGTGTTCATTGTTTGTTACACCTTGACCAGGCTCTTAAATTAGCAAATAA 
GCAGTTCCTrATAACCTTrCCAAAATCTACCTAT<3TTTAm 
GATCAACTGGTTTTACTCAAATATTGTAAGGAATAATGAATAAAACAAATAG- 
AAAAGTrATGCTACCACAACAACAAACAAAAAGGGAAATTCCCCATTGAAG 
ATTGiGT.CTGTGAGGACCACnTCCTGGTCTTAACTTTGCTTCCTCTGACTCCATT 

■ GGTAGAGAGGTACGCAAATTTCTAAGGGAGCACCCTAGTGCCTCATAACTCT . 

TTGGTTTAACCATCATCTAGTAATAGCCACCTGTCATTAAAAAACCCAAGCAG 
TGAAACACTGCCAACACATGGAAGGCGTATA<3AACTGAGAGGGCTGAGGCT . 
GTCAGGCATGGGAACAGGTATTTTTGTATCCTITCAAATATTTCAGTAGTGCT 
TATTATATAGTCAGTGtCTTCCTGACACAGTATGCCTTCTATTCTCAACATGA ■ 
ATTTCCTTAGAGTCACTTTTCTTrGTGCTTrCGATAGTTTCCAGTTTTA 
TTTATTTATTTATATATATATATTTATTTATITrCTAm 

TACTTTAAGTTTTAGGGTACATGTGCACAATGTGCAGGTTAGTTACATATGTA 
TACATGTGCCATGCTGGTGTGCTGCACCCATTAACTCGTCATTTAGCATTAGG 
TATATCTCCTAATGTTATCCCTCCCCCTTCCCCCCACCCCACAACAGTCCCCA . 
.GAGTGTGCTGTTCCCCTTCCTGTTTTATTTTTAAAAGGGAGAACTCACOT^ 
GTAAAAG^^-CATC AATAGAAGAATTTTGCCAGAAAGAAAGAAAGGCAATAC 
AGAGTCTTTTCTAATTCTGTCAATGGAAATGTCTTTTTTAAAAAATACACAGG 
CTTTCnTrGATTAGTCAATTrTTrTCCCCAAGAGTTGTATCT^ 
TTTTTTTATTCAAGAGTTGTATCTCACAGTGAGAAAGAAAGAAAGAAAGAAA 
GAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAAAAAAAAACTTGAAATAA 
AGAAACTTGAAAGAAGGTTGAGAATTGTTACTTATn'GCAAATCTTGAGTTrT 
GGCCTGAGAGTGGAGAGTAATTGGAACATTGAAGATGAAAAAATTTCTAAGA 

■ GATTAAAAAAAAAAACAAGAAAAGAAAAAAGAAAAGAAAGAGAAACAACT 
ATTAATTCTAAGAAAAGCACAAGGTTACATAAAAAAGCCAATTTCTTCTTAG 
CACATTGTGTGACrTAGCrGTGAATAATATTTATATAATGAAAAATAAAXTAT
GTGTTGACCATITCTCTAACAAAAAATTATGACATATCTCTCTTGGGGAATTA 
TGGAACAGGAAGTGTGTGTATATGTGTGTGAAGGTAGGGAAAGGAGGGGGC. 
TAGGTAAGAGAGCTAAATTTCCCTAGCAGGAAATTAACAGAAAATACCTAAG 
CTFTAAAACATCAGGAAATAGCTATCATAGATAG€TACGTAGACATATAGTG . 
GTCAATATTAAAAAGAAACAACTAAAATAATATAATGTACTTTTTT^^ • 
TmrriTrTGCTGGTGAGTGGAAATCAGAGGTAGGGACACTAtrGCTGT^ . 
GGTTAAAAGCCrTATAGCCTTATITGGCTGTTITGCTGATATACACGTATTAG 
CCTAATACAATTAAiui^TmAATAAAATCAGAGACCCTTCCAAGAAACrAAT 
ATTCAAAAAAACAAATATCAAAAGTAAAAAAAATAATAAAACAAG AGGAAA 
CAAtTGTCTGGGGCCAGGGTAGGGTACTACrCCrAGtrCCAAGCCAGTTTTTA . 
AATAAATGGACmAGGAACATAACCTGTTCATAAGGATGGACTTTCCACATT . 
TCAAACCCAGTAATGGTAGAATAACTGCCTTAGGACjATTCTAGATACAGCCT 

trcttrccgccaaclt(3ccccactgttcgactgtttctcatgatggtgggggtg 

aacagcatcctctcggatgcitgtaaagcaatggcctgaacagaggtaatgt 

tmaggtccatgtaactccattttcctagacataaagttgagagttaagttt 

TGAGGCCTAATGGTCCCTTTCCTAATCTGAGAATGAGTTGGAAAGCTCAGCTC 
TCCTTCCTTTCTCTGGGCTGCTCCirrCCAGGTGAAGGGGm 

agcgaggagatttgtcagattaagcacacagggtgcgatggaaggtaaatta'. 
aattgacgatagattcagcttaataagagtttgcaaacttrccaaattcttca . 
gtgaaggttcaaatttggagggaaaagtgaaaggtgttaatttactgggttta • 
ggttggtctggcgaagttttccacttgagtagtaaaatcggtccaagttgtag 

"AATTGGAAGAGCAGGGAGGGCTGAGGtGGGAGCATTTCTCGCATCTGTGATA . 
GGGCAAGCTTCTGAGGAATCAATTATCCrrCTTTTGGCATGAAACTCCCATTA 
GGCAATGACAGTTGCAACATAACCACCTGGTCCAGGTGGCTACACTTAGGAG 

. AAGAAAGTAGGGGGTAGGGGGAAAGAGGCCGAGAGTGCGTTTGAGCAGAGC 
TGTTGCAATGTGAAGAGATACTrCTTGCAGCTTTAGAGGAAAGGAAATTCGG 
AAGATiTGCmTGTGGTGTTGTTCGTTCITGTCATGTTAGGGGTTCTGAGCCCA 
TirGCTTGAGTATAAAGATAGTrcrATGGTAGTTGAGTCACAGAGAGGGCTGG 
AGGAGAGGGTTGCGAGAATGTCGAGAGAGAAGTGGGCATGGCATACAGAGCG 
GAAGAGTTAAGAATAGGCTTGCTACTCTGGTGCCACATrGGTATGACATCTGC 

■ CTTAG(XATTCGTTAiGTCCTTQCAGTAGAAGAAGCCCACCTrATCrTAAACCT 
GAGATGCAAAAAGTGAGTGATAAAGTAAATTAAAGAATGAGTATGTGTTTTG 
AAGTATGATGTTTAGAAAAGGGTTGAGGAAAGAGTGCrrrAGATTAtTAGATTG 
ATCAAAAAATGAATlTATGATACATGTrrTTGAGATGCGTATTATATAAAACA 
AAAGTGGTAAGACTGGGTATAACTGGTAGCAACAACAATACAAGTAGCrAGA 

• AGGAAAAGCATGGAGCAGGATGCGCAGGTGATrmTAATrrTATm 
TGGGGTGTCACTATGTTGCGGAGGGTGGTCTCAAAGTTGTGGAGTGAAGTGAT; 
CGTGCTAGCTCAAGCATCCCAAAGTGCrGGGATTACAGCTATTTTATATATAT 

. ATATATATATATATATATATATATATATATATATATATAGACACATATATACA 
CAGATATATATATAGAGATACACATATATATGTATATATATACATATATATAT 
ATACACAGAGATAATTTAGTTITGTGATGAGCCTTGTCTGCGTGATrAGAATA. 
TAAGATGCATGAAGTCAAGACrGGTGTGACAGATTTGTTTTTATCTTAGTGCG 

■ TGATATAGTTTGGATGTTATGCGAGGGAAATCTGATATTGAAATGTAATCGCG 
AATGTTGAAGGTGGGGGGTGTTGGGAGGTGATTGGATGAATGATGGGGGCCG 

. ATCTCTCAGGAATGATrTAGTACTTTCCCCTTTGTACTGTTCTCGTGATAGTGA 
GTAAGTGCTCACAAGATCTGGTTGrrrAAAAGTGAGTGGCATCTCTCTCCAACi
TCTCTTGGTCCTGATTTCCTCATGTGATATGCCTGCnTCCATTTrGCGTT^ • 

catgattctaactitcctqaagccttcccagaagcratgcttcctgtatttgc^^ 
ttctgccatgattgtaagtttcctgaagctttcccagaagctgatgcagaagc . • 
tatgctttctgtacagcctgcagaagcatgagcccattaaacctcmtcttc • 
.ttaattactcagtctcaattatttatattagcaatgcaagaacagagtaatac • 
agtgcttagcccagatcttacacataataggtattcaatgtacactgtgtgtt 
gttgatattcacaaattactrcctctgtttctacaaaatgttgaataatacctc ■ ' 
eatcaoaaaaacctgggttaaagtaagggtatrmgtctatctgcaaaaag 
ataaacatatrct^\.ttmctataatgattgcgggtagagataatctctgcc ■; 
ctcaacaacactcttccctcatagagggaaatgaaactacaaatgtatttga 

ATATAATAATAGTGAAGGAAATAATGTATGCTGTGGTCCXjTTTCGAAGACAA ■ 

agtgccttgaatcggtttaggtcagcaaaccacagaagaaataggatatact. . 
aggcccctgcttggatagccaatgcctgcttgtcaccacttccccttagttgc •• . 
ccrcacccaaaccaaagaagtttagtctgaaatgaaagcttactagcctgca 
aaatagctcgtmttctgttcttattagcctacccagctacttaggtcataag. 
•tcaaatacttgagttcctaagctaactaggattgcaatgtattgtgggctgca 
acaaaatgcagcaggacaaccctaaagaaaacacctaaagccactacccaa 
caaccgataggcaatgtccaggaagactgtgaccccatagtactcagcctgt . 
gaggaaccgggggaagggacctgtgcattagggaataaattgctttttgtaa 
ctgtgctgggtgtgcctgcccaccggacagccaatcttgcaagaccatcacg 
aaaaatctcacritractgtrctctgggtctcrgagtccattctttgggcttgg 
atggtgagttrgtttctcacaatagcaacaaatgcaacaacaacagaagcta 
atttitattgggcacttactatatgccaggatctgcttaaagcactttacatg 
tgttagcttattcaatccraaaaataattcttctaatcacatgcctccacatt 
gtcttaagaaactcatcctgtgttcaaaagctggataattttccaattttaca . 
gaatcaggttgacatactctacaatcctagtcagcatgataaagtgacactc 
. atacattcatttgaaagactttagggaaatagttactacattggcacagaga' ■ 
tgtggtgcctcaactctgtcatgaaatraggacttgtatgtgttacaagaaga 
ggtgtggatgaactaaagaaattgtttctatgagtagagatrttgaaacaga 
gagtccactggatcccaagtcactgctgtgtgaactcactcacgacacggga 
attctccaagtaccatcctgcctgactcattaatcttatgaagcacagagtga 
tcacacatgcccctgaaatgactgtatgtaaagtaaattcaggctgtaacac , 
acaaagirttcaaggttggcctcataggactgtagattcccccagcagatgg 
■ gcagaggaaggaaacttactctgtctgagattctctcgtatttccagggcaa 

CAAATCATGCGTAATGAAAACAAAGCAAAGTCAGTACCAGCCCAGGGCCAG 

ccatcaccccacccaaaccagaagggcaggagcctaattcatgaaatgtgcr 
• gtgcttcttcctccggccagccaggctcggggtttcctgatgtgttcctggaa 
ccagagctaatggaatcaggaaagcatgttactttgccactgccagtcattg 
caagtacaacaaaatAaatattgctttaaagaaaacaattatcataaagaca 
' attagtaatgaaaacagttatgccttcctttgtttttgagacagggtctcact 
ctgtcacccaggctggagtacagtggcacaatgtctgctcactgcaacctccg 
cctccccattcaagcaatntcgtgtctcgacctcccgagtagttgggactac. 

. AGGTATGCATCACCACATCTGGCTAATTrTTGTATTTTTTGGTAGAGACGGGG 

TTTCAACCATGTTGGTCAGGGTGGTCTCGAACTGCTGATCTCAAGGGATCAGC 
CCACrTCAGCTTTTCAAAGTGCTGGGATTACCAGCGTGAGCCACCGTGCCCGA • 
CCCAGTTCCGCCTTTCTAAATTGGCCTCTTAATATmAGAACATTT

CTGGCCTTGAGTGAAGAATAGAAACTACAGAGGGAAGGATTTGGAGTGGCTA 

ATGTTGGCAGAAGTGAGAATCAGAATTATGGAACTGCAAAGTCCTATGACCT 

TCCAmAetGAAGAGGAAACAGAAGCAGAGCAAGAGTGCTCAAGAGACrrA 

CCTAATGCCACTCCACACAGTAAGTACTGGAAtCCGGGACTTGQACTGCCAA ' 

TTCCATGTGCnTrCATTTGTGACATTACITrTTTTT^^ 

ATGTAATGTTTCAATAAAATTTAAAATTTTGGTTAAAAATCACCTATA/^TCAT 

AACCTTCTGATAACTATTATCATTOTGTATGTTTCOT 

•TACATTrAACATAATCAtAGCCATTAACCATATGTGTTTCrGTTmrnGm 

GTTTGTTTGTTTGtlTrTGAGACAGAGTCTCGCTCTGTTGCCCAGGCTGGAGT. 

GCGGTGGCGTGATTTTGGCTTACTGCA^CGTCGGACTCTCAGGTTCAAeGGGT. 

TCAAGCAATTCTCCTiSTCTCAGCCTCCCGAGTAGTCGGGACTACAGGCGCCTA 

CCACCATGCCTGGCTAATTTn'GTATTmAGCAGAGACGAGATTTCACCATA 

•TTGCTCAGACCGGTCTCAAACTCCrGACCAAAACGAAGTGTTTTTTTAACC^^ 

CAATTTAAGTAAATGTITAATGTTTATTrATAATITAriTATTAAATGm 

TTeACCAGGCACTATTCTAAATAAAACAATGTTGCTGTCCTCACATAATGAAG 

ATTCTGGAGGGGTCAACATAGAATACGTACATAACCAAGTAATGTACAACAT 

TCCACCATGATGAAGATAATTTGTTCCTAAATAAAGGTCCTCGTGGGTGAATT 

CTCATAGGCTGAAAATGTACCTAACATTCATTTTGACAGAAAAGACACrCirr 

TCTCTTAAGCCCCCAAATrAACACCTATTTATGTTAAAATAACATCAATCCCA 

TTAAAACAGGTACAATTATCTCAAAGGTAAATGGTTATCATAGGACTGATGT 

CTGTGCTCATAAAGCATCAAAGCAACCATATGATTCTCTGCGTTTATGTAATr 

AA^TGTTTAATGAAAAAACAAAAACAAAAACATGAGCTCTmTTGTGGCAC 

CTTGGAGGCAATTAGCTGCTTCAGGATGAAGCTAAATATCTCCTCCCCAGCCA 

CTGGCTGACAGACACTCATTGGATTGGACAACGAATGGCAATTTTGTACTTAT 

GAGAAGCATATGGCACAGAGGTTGCTGCCGACGCTCTGGAAGAGTTATGTGG 

TCCQAGTCAGTGGTGGGACCAACGAACAAGGTTTCCTCATGAAGCAGTGTGT 

CCTGACCCATGGCTGGGTCTGCCTACTGCTACTGAGTAGGGGCATTCCTGTTA 

TAGACCAAGGAGAACTGGAGAAAGAAAGTGCGAAtCTGTTTGGGGTTTCATT 

GTGGATGCCAATCTGAGCAITCrCCAGTTGGTTATCTTAAAAAAAAAAAAAA 

AAAAGGAGGGAAGGATACTCCTGGACTGACTGATACTATGGTGTCATGTTGC 

TGGGGGCCCAAAAGAGCTAACAGAAATGCAAACTTTTGAGTCTCTCTAAAGA 

AGACAATGCCCGCCACGCATATGTTGCCAGAAAGCCCTTAAACAAAGAGAGT ' 

AAGATAGTGAGGACCAAAGCACCCAAGATTCAGCATCTTGTTACTCTACATG 

TCCrGCAATACAAACACTGGTATTGCTCAGAAGAAACAGCATACTAAGAAAA 

ATAAGAAAGAGCCACAAAATATGTTAAACTTTTGGCCAAGAGAATGAAGGA 

GACTAAAGAAAAACTCCAGGAGCAGGTTACCAGAGACACAGGTTGTCCCCTC 

tgtaagcttcraotgtgagtctagtcaaaaataagattttttgacttactta 

acaaatgagtaagatcataccacccccagcaaaataatcaataaagactaag 

gacattgAtgaaaatgatggagaagatgatgacttccatgagtactttaaag 

gagaaacttaaagcaacitgtataatttttaagactgcagagctgacatgtg 

gtaacttatgctgaatctgctcagtagcttcccacagcccccaaatctgaatc 

acattccttcaatgtgaagaatgcctcataagctaaagttgaggaactaaat 

AAA.CTGATTAAGTCTCTGTCTTATGACACCTCCCACCCCCTATCACAGTTCTA 

tccagggcagcattgactagcgtaatagcagggtgtgatatgtggtaggaca 

GATAGGATGGGGAGGACAGGGAATGATGTTAGGTGGAGATGGGGGAATGAC 
AAGGAGTCTGGTGTGAGGGAGCCCACTGAGGGAGGGCTAGCtTGGTAGAAGC

jaatgctggecaggtagctaggtgctagattagagagcaccttgctagccac

actgaagacctgagtactttggaagccagtggaaacttctgagcaaggagtg

acacaatcagactcatatgtgagaaggagactgtgggtgagggagataagtt
aggtgacrattgtggggttcaactgaaagaggaggtaaaagcatgggtcag

■ ggttacaatggt(3cagacaga(3agaaatagttatagtttatctgtagtttga ■ 
atataaaaccaaccggattccctttgaattgaatatgatacatcagggaaaa ■ ' 

• agaggaatcaatcttagtgcatactctatgtcaggcgttgttccratgtttta . 

• catatatrcataacaataaaaataatatcaaatgcctrrctaatacgttccct ' 

• atgccagattctatatgctttaatctactttcacccacatcacccttactgca : 
atcceatagggtaggcactattatcatttccattttctgatgagaaactgagg 

■■■ cacagacaggaagggcaatgcacacgggaaatgctggagctgattttagac .. 
cccagacgttctggctgcagagttgagattcacaatcactgcaaacactgca 
ttccatggtcacaaggttttggtccgagcagttgagtgggtgatgttgctatc 
gaccaagacaggaatacctgaggcagagcaggtttgggggagtggatagcg 
aaggcctgggtttttggccatgctaagtttgggat.gcctgttagacctcta^ • 
tggaaatgttgtgtggccatagggtgtgaagaatctggagtttaggaaagga 
gccagaactggagatagctgagctatatggaatgatcagctcataaacagaa 
cttaaagctgcgggacaggatgaaagtactgaggaagactaagtcct 
ctccagcatttacagtccagaaataggaggagctccagcaaaagagattggg 
aaggagtgacctgtaaggctggaggaaaccaggagagtgtgctggaaggta ' 
aaaaatttctaggagggaatgatccacrctgtgaaatgctgctgagaagtcc 
agcaaagggaggatgctgccagatgtcatgatgcccagtttaattgttggat " 
acaacttcttcaagtggaagaattctctitttatctactcttgcatm 
ccttcacatatagctcacaaagtagaggaagagagctcatctaacttcaacg ■ 
tgaagttgttaatrtgaattcagtttaaatattrattgggtgtcaiagtatggt. 

• ' actaggccacagagattcaggggtaagtgaatcacAgagttcttgcttttttt 
ttttttttagacaaggtctcgctctgtcacccaggctggaatacagtggcaat 
cacggttcacggcaacrrctatctcctggggctcaagcaatcttcccacctaa 
gcctaccaaqtagctgggattataggtacatgccaccacatgggtatttattt ' 
tattttttgtagagatggggtcttgccatgtggcccaggctggtctctaactc 
ctggcctcaagtgattttcccaccttggcctcccaatgtgctgggattatagg 

CATGAGTCACTATGCTGGGAGGGTTCTTGCTCTAGACAATGGACTAAGCAAT

GAGAAGGAGGAGGGGAGGGAAAATAGGAGGAGGAGAATGATGAGGAGGGG
TGCTGGGGAGGAGGGGAAAGGAGAGGAGAGGGAGAAGGAAAGGAAAGGGA 
GAAGGGAGGAGGAAGAAGTCGAGAAAGAAGAATACCATGATAATATTAATA 
TTAACATTTGGAGGTTGCCTAATGAACTCAGCCAGAACACCCCATAGAAAAG 

cccaatggttgacacacctccgdtaacaggccataaagtactagggggaga 
ataaaggctttttcacatatgctcattctcaaaataatgatctcctgtgcatg 
ctttctcaggacgctactGgatgatatgcacggccaagacaagggagtatcc 
aaagaagaaacatgagattcaggcaataaagaatcagtaggatacaggcca . 

• gaataattcccaccatattttaaactgagatcccaagacaaacactgtgcac 
caggtctagataacaatcagtccacattgattgccrgattttcaaaacatatt 
aagtggaggtttacgctcttggggaagagtttgggaaggagattatggggga 
aggttgcagggggtgtttcatttttgattitgttrctgagtatgtagacattcc 
cltagtrccccagtgttcaatacagaggccgcctcacagttacatcagtgttc 
CCTGTCCGGAGTCTCCTGTTAXGTTCTGtGAGTAAGACTTCAGTCTrCTGCT
GGCTAGGGGAGGGACAGTTGCCCAGCTGCACAGGGTAGGCAAGGGGATAGG 
TGAAATGTCGCTTCATACAGATTTGCAACCAACTCTGtTTTTTAAGTCTCTATC • • 
CCAACATAGGCCCTG(:TrCAAGAGCTATCACAACATCCAATTCCTGCn"CCnTr- 
TGGGGCTTCTGAQATACAAATAAGCCTGCTTCTAAGACCrfCTCATGCTGCT 
TCTTGAATTTCAGCtTTCTCrGCTtTrcrAAGTGAGTCTC^ ' 
CnTTTGGATCCCAGAATrAtATGAGTGTTTTGTTTTTrGTCT ' 

atccttatgactagagtcatttitgtgaagtrtggaagggaacaaaaggtaa 
aatatgtgctcaatctgccacactgtttaaaaagtggttttctttttAaatta- • 
ccaaatatatgcatacagtttaatatatattataaagtgaatgAtcattaacc ■ • 
aacactcaaatcaagaaataaacattgcragerrccctaaagccccccatat 
. gtccccaotgattacaactagttctctccctttagatgtgacaccatcctac . 
cttitatgattactgcttccttctmccitracagt^ 
attgtgaaacaActgagtttagctitgcctgcttttgaattitatgta^ 
agacatatggtgcatattcttttgtgtttcgtttcrmactca 
attcaaccacatcatcattacatatatccacntggttatacacggataccac ' 
agactatccatgttactgtggatgtacttctgagttgttrctagtttggagta 
aatcttaatgctatgaatattcttgtacacactctttgtgcacatatacacac 
• gtxrcatttggtatgccacaagaagtggaaattcitgtggcacagtgcataaa 
catcttgatccttctagatattatccgttgttttccacagtgcrtgtagcag^ . 
gtctatgcaagttcccagtgctccacgtttgtgccaacattggtattgtctga 
cttcagtggtggctgtacaacctaacgtgttcagctggaaataccttgttaag 
tirrccatgtaccttctatcgcttcatcactgcacttttggaaagagcttctaa 
aactcctttcttatagagggcagacccataaiggcaacctacatgttccattgt 
irmcttagagaagatttcaggtagagcttctgacaacctgctccaattggg 
actagch'gctcactaatrctgcctcaatgatccgtttcctgaatctcatgtctr 
tttttttttetgcatttttcccgttttcatag 

tgagaaatgatgtgtaaggaaatactatttttgagaccitgcatattrgaaa 

atatctttattcrccactcatagtaaatagtitgggtagagaattttaggttg 

gaaataatttttcataaaaaacttrgaaggcatttttattt^ 

gatgttgctgttgagaagtctgatgccattctgatttctgatctttgtatatg . 

acctattttttm"cccrctcratggaagcttttaiggattgtctctgtc 

acatttcacaacagaagtgaattgttaaattcactacacacttggtgagcact 

ttcagtatgaagaatatgttcitaagattrttggaaattttcctgaatm 

ttgataattgccitccctccactctttctctagaattccragagtcatgtttrc 

aacctccagcattaaacctctaattttctttrctttccactt^ 

ctgggtAtttitgttccactttatagaagctttcctcaatt^ 

crtccactaattctgctcctatattttttagtttccaagagctcctrcttatt 

mtatgtgacattttaataggaaactcttcacatttaatgaaagcaaaattg 

■ TCCTCCTTTTTTAGAAAATGGTAACTAGTTTTAAAATGllll'CTm 
CATTTACTGXirrCTCAATGTTTCTtlTACAGTTTATT^ 
TGTGAAAGGTTTTTTCCAAGTGGCTAGTGATCCCTAACTGACCAGTTCATGTG 
AAGGGGAGAGGCACCATGAGCTAATCAGAAACTTATGCGGAGGGGGGACAT 
ACAGATTAGAGGGTCTCACTTTAGGATAACCAGGTGGGAGCCCTGTCTTTTTA 
TAGCCTCTAAATGCCAGTTTITGTTTTGTTTTGTTITITGTCTA 
TGCITGGTTTTTCrAGAAAGGAATTCTCTCCAATCCTGCCTAAAATAGfTGTAA 
GTCn:GGTATTTAGCATTCTGGGAGCCrGGTAGGGGAAAGGGTGGATGGGTCA

GGGATGGTGGAGTGTTGTCTCACTCITCATTAGGCGAA(nTCr 

TTTTCCAGTAAAATGCCTrGCTarAGCCCnTAGCTGTGACTAGT.GC 

CCAGAGTTTTTTGGTTTTTGTTTAGTTTAACCTCTCCAGAAAGGGTATCirrA 

TmCTGTCAAGAAGAAGGGAGd-AGCAGTTACCJGGCTGCCCAGTCTAGGAG ' 

AGGGGAAAGGATGTGGTCTCTAAGAACTGCGTAtATGAGTCOTGGGT^^ 

ATrCTACCTCATATCTCTGCCTTTAGAGGTATACAGCATATCTGATTTTGGAT 

ATTTCTTAGGGGATGTATTAGTCCGTTrTCACGCTGCrGATAAAGACATACCC ' 

TAGACTGGGTAATTCATACAGMAAAiGAGACTTAATGGGCTCACAGTTCCAC. 

GTGGCTGGGGAGGCCTCACAA.TCATGGTGGAAGGTGAAAGGCAGGTCTTACA 

tggcccgagggagaatgagagagaaagagagagagagagaaagagagaga 

GAGAATCAAGCAAAA>GG<3GTTrCCCCTTATAAA^iCCATCAGATTrCTTGAGA ' 
CITATrCACTAACATGAGAACAGTATGGGGGAACrGTCCCTATGATTCAATTA 
TCTCTCACCAGGTCCCTGCCACAACATGTGGGAATTATCAGAGCTACAATTCA . 

agatgagatttgggtggggactcagccaaaccatatcagggagcactAtagc 

ACAGATTGTTTTGTTCTTGTTGCCTTTCATCrmATGGTATTT^^ • 
' GAA.GGCTAAAATTGAAGTTAATTTCCACTTGTTTGtCTGCATTCATrCTTC ' 
AA^TTGTATTGAATACAGTCGATAAATTGTATTTATCTGCAGTCACCCCTTGT 
CTCCTTCTCTTTGTCCGTATAAGCTA^ACACCTTTTTTATTCCTTTAATGCCAT 
mAGTGGAGTATATACAATCCATACATTITCCTAGGTATTTTrCTTTCTTCAT 
TCATACTTTCTATATCCAAAAGAGGATTTGAGCtTGTTGCAATAAAA.TATACA 
TATGCGAAAAA.GTTGAAATTTGGGAAAAGTAAAAAATATCAAGTAGTAAAA 
GAAAGGAAACACCTGTGGTGGAAATCTAGGCTAAGGATATACkjCCGTGACtG 
TACAAA.GGTTGGCCCTTACTAAACACCTTGGCAGTTCTGCTAAAAGGAGGAA 
CAGGAGGAATTTCTCAGCCCTCATTATCTAACAGGAAGCACTCCAGGGCATC 
AGATAAAACTTCTGCTAAGTTTACTGAGTGAGTTGATTCTGTAACTGAATAAA 
AGTTaTGGTGCTCCAACItGGAATTGATTCTACTGAAGAGATTAGAGTGAGA 
CCCAGATAAGAAAAATAAATAAATAAAGATAGATGTGAGGGGTAGGGATGC ' 
TAA.GTCITGATTGATTGGCATCTTTCCCAACTCAA.GCACTGTGATGACACCAT 
.CTGTTGTTACTCAGTCrATATTCCCAATTTATTCAGGGTTrCTAGGTGGAA 
ACATTAAGAAAGTACCCTGTAGAGGAATTCTTATAGTCTTCTCTTTCTTTTCC 
ATTTCAACAGGATATTCTCAGAGCCCTCTAGCAACTGTGACTGTGATTTCAAG 
GCAAGAAGTAAACAGACAGTAGGGAATTATGTTAGGAGTATATATTCCTTTA 
CTTTrCCTCTCAAAGAGAGAGAAAATAGGCTTTTTTTTTTTCCTGAT^ 
GIGTATGAGTAAGCCtAGATTCAGGGCCCAATAAGATCATGCTCAGATTTTC . 
■ AAATrACAGTTTTAGAAATTITGGGGGAAATTCTTGATGGCAAA/VGGCATTGT 
• GAAATAGTATCAATACAATTGCAGGTTTAGCATTTTCTTTATGCAATAAACAG 
TACAGGGCTCTGGGGCTTTACAAAGATGAGCAAGCAACAGTCCCTTTTCCCA" 
AGGAGCTTACAGTGTCTCACAGGAGATAATAAAATGACACAAGTGATTAATA 

gaggagagagcttcagaatgtctagccagctacacatgtctagccacctgcc 
tacatggagacccagtccccttttcccacccacaaggtgaatctgggaagcc 
. acaaacaggaaccctgcccttctaactgcagctgattaaattagaggtcaag 
agatacgtgacctaaggaaa/^ccaattggatttcctctccaaa.gatttaaaa 
tragaattcaaaggtgctaatcagtcrctgctgtccactgatttaagggcgta 
gaa.gcactggcttggatatttctgaccaggcaccagtggcaatgcagagaaa 
acacatctggagagaaagaggcatgcagagtggctttccagtcctgcttttg 
GCCTCATGGGATCCATGAGATACTCCCATATTAAAAGTTTTGATtCAAAC^

TGTAAfACAATACTACTGTGAAGACTATGAAATATATGCTAAGGGGGCTCAA" 

AGAGGGAGAGAAAGCACATATTATTGTGGAGCAGGTGGGATTAGAAATAGT 

GA/U^TATTACATGAAAAATTTTTACGACTGAGTTCACOTAAATGATGGGT^ 

AGATTCTAACAGGAGGAGATAGGTTGAGAGCATTAATTAGTGAAAGAACCAT 

ATGGGGGAGAGAGAAACCAGGTGTGTATGTTCCAAGGTGCCTGGGGGTGGTT 

AGGTTTTATGATGAAAATTAAATAGCTOTGGAGTAGATTCTAtCTTTGCATC' 

AGCTATGAiCGCAAGTCAGCATATTCTAGGTCCTTTCrCCACTr.GGAAAGAATT 

ACTGCCAATTATCAGTCCATTTCCATTXjGCTtCCCTCTAACTAOT 

AAAAAATGAAAAGTTCATTTATTCCTATAGTTCTATAAAAGAAATCTACTCAA- 

AAAGATGTGAAATGACTTATAATGCAATACTGTAATTTTTTATATAAAGTTCA 

TGTrrGTIT.CCTTGTTACAGGATAAGTGGTAAGTAAATATTGCCTAGTAATGT 

GACATGAGTAACAGAAAATACAAACTTATTTTCGCCCTAGGGAAGCCTC3CTT' 

ACTTTTCTTGACCGTCTTCTCAATATCTATCTATAATCTTCAGATATAGCAAAG 

GGCCAGCAACCACTTTTCTGGGAAAAAAAGATATTTTGCCAAACTTTGAAAA 

CACACCAAAATATGGGACTGAAAAATAGTGCATATATATCAATTGAGTGeAG 

TGGTCTGATCATAGTTCACTGCAGCCrCGAACTCTTGGCCTTAAGGGATCCTC 

CTGCCTCAGCCTCACAAAATGCrGGGATTACrGGTGGCTTGCCCTATtGTTTA 

AACTAACATTTTTGATAAAATACTAAATGTGAATATCItCCAAAACTTGAAAG 

AACTATGCAGTTATAAAGCATTATAAAAATAGGCATATTAGATATTTTTATAT 

G TTTIT AAGTTCATTGATTAGCGAGAGGAATAAAACTGAACTCAGTAGAAAA • 

GTTrTGGAGAGAAACAAAAAAGTGAGGATTTTTACCTTATAGCTAACATTAT 

CrACCrCATTTAGAGAAGGATCTTGTTTTTATACTATAATCCTTTTAGACA^^ 

AAGCCAATGAAATTTTAAATTCAAAGGCAACrCAAATGATTCTTGACAAGGG 

TGACAAGAimTrCAATGGAAAAGGGTAGTATTTTAAGCAAATAGTACTAGG 

AAAACTGAATATCTACATGCAGAAGAATCAAGTTGGACCCTTACCTAACACT- ' 

GTATACAAAAATTAACrCAAAATGGACCAAAGACTTTTTAAGACCTAAAATG 

ATACAATTCTTAGAAGAAAACATAGGTCAAGTCTTGAAGATATTAGAGTTGG 

CAATGATTTCTTGGATATGACACCAAAGGAACAGGGCACAAAAGTCAATAAA 

TTGGATTGCATAATGATTTAAAAATTTTGTGCATCAAAAGACACTATCAACAG 

AGGAAAATGATAACCCACAAAATGGGAGAAAATATTGACCAACCATATACCT 

gataagcgattaatatcqagaatatgtagacaaatcrracaattcaacaaaa 
aacaatttaaaatgg(k:aaaatacttaataaacacttctccaaagaAgatat 

QCAAATAGCAATAAGCACATGAAAAGGTGCCCAACATCACTAATTATTAGTG 
. AAATGCAAATCAAAACTACAAGATACCACCTCACACCCATAGGATGGCTACr 
ATTTTTTTTAAAAAGAAAATAACAAGTGCTGACAAGGGTGTGGAGAAATTAG- 
AATGCTTGTGCACTGTTGGTGGGAATGTGAAATGTTACAGCCACTGTGGAAA 
ACAGTATGGCAGTTCTTGAAAAAAAAAATAGAATTAGTGTATGATCCAGCAA 

trccacttttaggtatatgcrcaaaagaatagaaagtaaggatttatgaaac" 

atttgtatattcatgttctagcagcattattctcaatagcaaaaacatggaag 

taAccgaagtgtccactgacagatgaatggataagcaaaatgtggtatatcc- 

atacaatggaatataactcagtettaaaaaggaaggagattctgacctatiac 

tacaatgtggatgAatcttgagagtattatgctaagttaaataaactagtca . 

caagaagacaaatgcrgtatgattacacttatatgaggtattttgagtattca 

aaaccacagagacaaagtagaatggtggttgtagggggttaggggatgggt 

caatggagagttagtgtttaatggatatagaatttaagttttacaagatgaa" 
GAGTtATGGAGTrGGACGTTGGTGATGGTTGCATGACATTATGAA.TGTGTTTA
ATGCTACTGAATTGTACACTITAAAATGGTTAAGATAGTAGATTrCATCTTAT
GTGTAOTACCACAATGAAAAATATTGGGAAGGCCAGGTGCGGTGGCTCACG
CCTATAATCCCAGCACTTtGGGAGGCTGAGGTGTGTGGATCACCTGAGGTCA
GGAGTTCGAGACCAGCCTGGCCAAAATGGTGAAACCCCATCTCTACTAAAAA
TACAAAAATTAGCCAGGCATGGTGGCAGGCACCTGTAGTCCCAOTATTTGG
GAGCCn-GAGGCAGGAGAATGGGTTGAACCCGAGAGGCGAAGGCTGTAGTGA

■ CCCAGGATCGCACCATTGCACTGCAGCCTGGGTGACAiaAGAGATACTCCATC ; 
TCAAAAAAAAAGGGGGGGGGGAAAGTAAGTCATATTTCAAAATATACTCAA 
GAAATATTTTGTTCAAAGGATCTGAGCAGCCAAGTAAGGGGCAGTCCACTGT • 
GTCAAGATGCCATACCGTTCTGGTGCACTGTGTGTCtCtCCATTCTGGGTCTG 
GCTCTGCCTrCCACTTCACCGTGCTTCTGCCTCAGTGGTCCTGGTGCCAGTGG- . . 
TCTrCTAAGTAGCTCrcTAGAAGAGAACAGCTGTTTGCCACAACAAGGGAAA . 

• AACtGGAGACtATCAACTTGAAGTAGAGAAGGTAACAGTAGGAACTGGTTAT ' 
TGTCATAGTGTTTTTAAATCTGAGAACTTrTAACTTTGCCAACXAAGGATGAT, 
TGATGGGGAAAGAGTAAAAtATCTGGGAAACTGGTCTCTGAGAAAGAAATTT . 

■ GTTCGGTGGAGGTTGTGGGTGAGATGGATGCGGCAGAGACTGCTACTATCCC 
CTAGGTTCCATTCTCTCCTTCTTCmAGAAATAGGACCTrCAGTTTITAGTTC " . 

• ago atgtggtcctctggtacaaaggctacatttcccagtctcccttgcaacta • 
catgtggtcatgtgactaagttttgtccaatgggatgtaagtggaagtgcctt 
gtaAaattgatggtaa^u^gtccttaaagggcatagctatgccccatcagccc • 
ritcctttttcatactacagtggaagtccaatgtgaggatagcagaactgtta 

AGAdAGGAGTTTTAATCCCAAACAAimTCTATGGCCTCATCCCCTGAeCTAC 

■ TCCAATCCTGGGCTAeCTAtTTCTGTACTTTAAATTGAAAGAGAATACATTTG 
CATCTTGTTTAAGCTATTGTTATTTTGGTTTCTGTCACTCAGCCAAACTCAATA 
CTACirGAtAAAATTGGTAAAAAAAAAATGATAAATTAGAAAACTGCTTTAT 
Gr(n"AATGAAGCTTrAAAATAAAAAGTACTTCCTCATATGGGTTCTCTGTTTr , 
TCCTCCAAAATGTCTATGAAGACACAGAAAAAGAGGAAATTTACAACATTAG. 
AAATTAGGACCAGTGCAACTTAGTrGAGAGAATmCTAATAATTCCTTCTTT 
TGTCTATTCATn-GrTCAGCATGTTTATTAAAAAGTACCCCTCATGTGCTAGG 

: . CTGTTCTAGGCCCAGGGAATATAGTGGTGAACAAAACAGATAAAGTTCTTGT 

■ CrrCATGGACCTTCTATTCTAGTGGGAGAACAGAGACCACCATCAAGATAAA, 
AAAAATAAATATAATGTCAGTTTGTATATGTAGATGAAGAAAAACAAAGCAA 
AGGAAAGGAGTAGAAAATGATAGACACAAATTGTGAATTGAGTGGTCAGAG 
AGGGCCTCTTGGAGGAGGGTGACATCTGAGCAGATCCCTGAATGAAATGACA 
GTGGGAGTTCTGGTGAtATCTGGAAGAAGAGCATTCAAAGCAGCGAGAACA 

■ ACATGTGCAAAGGCCCTGAGACAGAAACAGGCTTGGCAGATTCCAGAAATG 
GTAAGGAAGACGATGTGCTTAGAGAAGAGTAAGTGGATGAAAAGAAGTGGT 
AAGAAGAGATGTCAGAGCTTGTCAGGGGACAGACAGTGTAGGATAAATTGA 
CAGGAGATAAGTTGATTAAAAATCATACTTrGTGTCCACrrCTGAAAAGAAAA 
TGTTQTGAAATGATGGGAAGACCTCTTGTCCCTCCATTACAATCTACAATGGG 
TTCAGAATCATAACACCTACTCTCATGAGAGAGATGGAGTATTAGTCCATTTT 

■ CATGCTGCTGATAAAGACATACTTGAGACrGGGTAATTTATAAAGAAAAAG^ 
GGTrTAATGGACTCACAGTrCCATGTGGCTAGGGAGGCCTCTCAATCATGGTG 
GAAGGTAAAAGGCATGTCTTACATGGTGGCAGACAAGAGAGAATGAGAGCC 
AAGTGAAAGGGGTTTCCCCTTATAAAACCATCAGATeTeATGAGACTTTTTCA 
ccaccacaagaactgtAtgOgggaaaccacccccttgattcAattatctcct^

CTGGGTCCCTCCCATAACACATGGGAATTATGGGAGCTATAATTGAAAATGA 
GATTTGGGAGAGGACACAGCCAAACCATATCATTCTACTCCTGGCCCCTCCC 
•AAATCTCATGTGCrCACATTTCAAAAGCAGTCATGCCnTCCCGACAGTCCCTC • ' 
AAAGTCTTATTrCAGCATTAACTCAAAAGTCCACTGTCTAAAGTGTCATCTGA 

■ GACAAGGCAAGTCCCTTCCACCTATGAGCCTGTAAAATCAAGAAGCAAGTCA . 
•GTTACTTCCTAGATACAATGGGGGTACTGGCATTGGATAAAtATACCCATTCC 
AAATGGGAGAAATTGACCAAATAAAGGAGCTAAAGGCCCCATGCAAGTCCA 
AAATCCAGTGGGGCTGTCAAATOTAAAACTCCCAAATGAtCTTTTTTG 
CATGTTTCACATGCAGGTCACACTGGTGCAAGAGGTGGGTTCCCATGGTCTA • 
AGGGAGCrCCACCTCTGTGGCTTTGCAGGGTACAGCCTCTCTCTTGGCTGCTT 
:TCACAGG(^GGCATTGTCTGTGGCTrTCTCAGGCACATGGCACAAGTTGTTGG'. 
TGGATCTACAATTCrGGGGTCTGCAGGATGGTGGCCCTTTrCTCACAGT^ • 
CTGCGCAGTGCCCCCGTGGGGACTCTGTGTGGTGGCGTCAACCCCACATTrCC 
CTTCTGCACTGCCXn'AGCAGGTGTrCTCCATGA(jGGCCCTGCCCCTGCA<^ 

acttcrgcctggacatcgagatgttttcatgcatctctgaaatctaggcagag . 
gttcccaaacctcaattcttgacttctgtgcacaAgcaggcacaacaccacat 
ggtagctgccaaAgcttggggcttgcaccctctgaagccatagcccaagctg • 
taccttggccccttctagccatggctggagcagctagaacacaagacaccaa 

GTCCCTAGGCrAGACACAGCAGGGGGTCCTGGGCCTGACCCACAAACCATTT 
TTCCTCCTAGGCCTCrGGGCCTGTGATGGGAAAGGCTGCTGCAAAGTrCTCTG 
.ACATGGCCTGGAGACATTTTCCCCATTGTCriTGGAGATTAACATTTGGTTCCT 
CATTACTrATGCAAATTrCTGCAGCAGGCTTGAGTTTCTCCCCAGAAAATGGG 
TirrrCCTTTCTATTGCATCATCAGGCTGTAAATTTTCCAAACrm 
GCTTCCTTTTTAAAACTGAATGCTTTCAACGTCACCCAAGTCACCrCTTGAAT 

gctttgctgcttagAtatttcttctaccagataccctagatcatcttcctcaag 

ttcaaagttccacaaatctccagggcaggggcaaaatgccatcagtctcttt 

gctaaaacataggaagagtcaccttcacitcagttcccaacaagtttctgAtc 

TCCATCTGAGAGGACCrCAGGCTGGATTTGATTGTCCATATCATTATCAGCAT 

TirrGTCAAAGGGGTTCAAGAAGTCTTTAGGAAGTTGGAAATTTGGGCACATT 

TTCCrGTGTTGGGAGTCCTCGAAACTGTTGCAAGGTCTGTCTGTTGCCAAGTTG 

CAAAGTCTCTrCCACATmCAGGTATTTTtAGAGCAGCACTAC 

GTAGAAACTrACrGTATTAGTGGATrTTCAGAGTGGTAATAAAGACAAACTCG 

AGAGTGGGTAATTTATAAAGAAAAAGAGGTTTAATGGACTGAGAGTTGCACA 

GGGCTGGGGAGACCTGACAATCAGAAGGAGAACGGTATGGGGGAAACCACC 

TGAATGATTCAATTATGTCCCACTAGGTACCTCGGACAACATGTGGGAATTAT 

GGGAGCTAGAATTGAAAATGAGATTTGGGTGGGGAGAGAGCCCAAGGATATC 

AGTTGGGGAGGGAGGCATGTGTTGITGGGTGATTTTTACGAAAGTGGATGTTT 

crctgtacaatgagaggctrgcacctccagagaatgtgttatcttaatcctct 
ggctggatgcttatgaggatagagtaaagtgggaatgcggagtgaatgttgg 
..gaggggtgatggttagaaataaaggagtagagtttggtagagatggaagaag 
trcagatgcaqtggatgcrgtaagggatgggtntatgatggtatttcatggt 
gatagtgtgagaacgcgagagagagggaacaggtgacaatattgtgtctgtc 

caaaagagtgatagtaaatggaaatctggttatgatgctattgagttaggta

CTAAACTTTGGTATGCGATGGAGTTGTAAAACGTAGAGCCGCAAAGAAGAGA 
ATAATAAGATAGAAAACAACACAACTGATCTTTTQGCTAACACTAAGTCTGG 



WO 03/104427    PCT/US03/18239



aaattacccttccatttgagatgattgccagaaaaatcatcaatatAtatcct 
atgaaaaactcctgaggggcagaggaatgggatcaatgcttcaaagactca
gagaaagaaaatcactataaaaattatttgctcctagaatgtaaaaaAtatt
ttaagaaaactgcttggcattctaaaggtgaaaagactttagaatgagataa
aagisgagaagattacnttta^^aattcagcgacattaaaaatgtaat 

AAGTAAAATCTACATCGGGAGTAGGAATGTAGACTCCTGTAGCATTACTATG ' 

acagacaccagagtcaatgaeatgaaaqacaaacttaagatcttttcaggta • 

tgtggtaaaagaaaaaaggattcaagtgaatgatatagaggactggggctg . 

gggxtggtgggtcaggtctgtaatcccagcacttcgggaggccaagatgggc 

ggatcacgaggtcaggagatagggaccatcctggccaacatggtgaaaccct 

gtctctactgaaaatacaaaaatcagccgggcgtggtggtgtgcacctgtag 

tctcaqcractcaggaggctgaggcaggaggatcgcttgaacctgggaggtg 

gaggttgtagtgagccaagcttacaccgctgcactccagcctgggcaacaga

GTGAGACTACATCTCAAAAAAAAAAAAAAAAAAAGAAGAAGAATATAGAGG 

acrggatagatttatttaagaattatacatttctgaggaaggtactagaaga 
attagaacaacAataattaaagatatcactttaaaaaacaagacctgactat 
tcaaattgaaaggactcaccattctagatgatagtaatgaaaagagggctaf 
tctagaaatagcctggcaaaggtttgggaatgcaagtataaagaaaaaaatg. 
acatctattagtattctgtcataaagcaagtaatttatgaggaaacaaaaat 
ctagctgttctcagatagctctccagtaagAaatgccagaagacaacagaaa 
aaatacttaaaattctttgaagaaaaagactatgacctAagaattttctcag 
ctgaattttttttctcatatgtgaagacaacaaatgaacaatcccatatattc 
. aaaggctcaggaaatagagcattcatgtattcttcatgaaaatattatttgg . 
agacatacatcagacacccgaaagatgtgaaagaagaatagggtatggatt 
acacatagttcatggcattatgtaaatgttataaagcaagatgacatagctt 
gggaaaacaaaagctgtgtctaatagcagtacnccaagccataacttacag 
tagcccaattctcaataaattggaaggcagctaaacaatcatacagtgctag 
tattttatagtgtcagggtctattcacatataatcxcatttaacccattttaat 
. tccgtgagataaaaaccaatat.ccccatctaagatatgggaaactaagacgt 
agaagaaagcacttggctaagatatcatggctcgtaggtggcagtcaagagg 
tcagtttgcagtctacagatitaacctcagactattct.gcttctaacatgact 
atagaaatgtattgatcattagctgcctgaagttctgttcttaactctaggtg 
tccaaaagaagatgaattttgtttagatagcattttcccatatctagggctgt 
tggctttaaagaagttcccagaagtgaatccaattcccagaaggatcaaggt 
ggattcttttgtgtgttacrctaacaggtrgctttatatatatatataagttat 
ataatatacacatataatttatgtattatatataagtAtgtataacatcatat 
• .gtaatatataaatatagtacatttcacactggtaggacgtaattccaaacca 
■ catatttgtttaaccaattgaattacitggattragtrtctatttccctctt^ 
mccrtctgaaaatattatacaaataacatcagtttacaaaaaataaaatct 
gatatagcttttttatcatacactaggctagactaaatgcattctgtggattg 
tttacctaggaccagatggtatattataaagttgtatacaaatgaacaagga 
ctgcctgaaatggatrgatagctgagcagatttggctggagcgtctattttag 
aaggaaaacctgagaatcatattatttgagtgtgattcatgtgttaatagtat 
ttcaagacaagccacttaaaatAtgtcctgagtggtgatgctggaaatgatc, 
..tmcttcagtgttttcaagtgtcttcatttaagtgtacacattttgcctacct 
ataacacgtatctacacattgtgattaagagagcaaactctgaaatcagacc 
TAGATrAAAGATCTTAATCTCTGCCAGTTGATTGGGCAAGTGACTTAATCATT

CGACAGTAATTTCTGCATTTGAAAATGCCTATCTCAAGGGTAAtGAGAATTAC 

ACTGGTTCACCAAAAGAATTACACTGACTAATATAAAACGTGCCAAGCATGf 

ATGTGTCACATGAAGGCTCATTAAAAAGTGGATATTATTGTTAATCnTCCAAT

AACTACTATTTCCAACAACAGGCTGAAGGGGCTCAGAAACGTTTGTTGAGTA

;AAAACAGAAG GAAu^ AGTAGCACAGATTTCCTGCrcn'CCIT^ 

ACCTGTCrAAGGACTGTGA fCTrrrGT^ CGCTACAGATTGTCACCTGCATr^^ 
.CTACTGtCACGCATTAACCTATCAAATAAGGCAGTCTAAA^ . 
CggTTrrgGGTAAGGAGCCGGACfGTTGAACTGGAAAGCTAAAATTCA^ 




TACAAACTGGGAATTACCAGCACCTACGAGGtAAGGGGGAGGGTGGGCTGG 
GAGGATGGAGGGGGGGGGGGAAGGGGAGTGTGGAGAGCTGTGGAGGCGTTTT 
CTTGGGGGCAGAGTGTCACTGCCACATGGAAATGTGTGGGGGTCGATCGTGAT 
CATCACGCTGCCTTCAGTCGTCCGGGTCGGGGGATGTTGAGGTGGGATCTTTGC 
CITCTACTGAGAGGCGTCGTCTAAGGGTGAAAAAATTCTTGGGATTTACTCTC 
CTGGGGTTAGTGAAAAAAAGAGGCTTGGAAAGTGAACGGATTGCAACAGTAG 
TGCTGGGTATGGTGCTrTGTACGTTTAGCATCTTTGATTCGGAGGGAAGGGGA 
AAGATTTITGGGGAAGGTAAGTTCTTCAGGTCTCAGGCCGTGCTTCTT 
, GAATAGTGTTGTTGC AGGTGCTGAGGGGGATTGACTGTTGGAAGATAGTTGGT 
AAAGAATTTGTAGTCGCCriTCCn'GCGGTTCGACACGGAGGGGCTTTCCAGTGA 
AACAGGCACCCAAGTGGCTAAGGTGTAGGAGTACCTGTATTTCGGAGCAGAT 
TCTAGGAGTTAGTAGGTGGGTGAGCTtGGGTGAGTrACCGAACCTTTTGTGCC 
TATTGGTGAAAAATAAAATAATATGAGGTAGCTGATGGAGTTGGGAGGATTAA 
ATGAGATGATGGAAGGTGTAGGATTTAGAATAGAGTTTGAGACAGAGTAAGT 
GCCAGATAGGTATTGGCCA^TAATTACTGTGGTGGTGCTAGTTGTGATGGTGGT 
AGtrATGTTGGGATGAGTAGGAAAATTAGGGAGCAAGGTTGTGAAAGAGCTG-: 
TGCGTTCTGTATGGAAAGAGAATTAAGGAAAACTGTCTTTGGTGTATrGCACA 
GGTCGCGTGGCTATAGGTmCTCAAGTTGTCTGCAAGAAAGGGGTTrGAAAA 
GATGATGGACGnTAGTTTAAAGAGGTGTTGAATGGGAGGTGGGGGCGGGAT 
CAAAGGTGAGGGGAGTGGGGGGGGGGAGTTGAGTGGGTTGAGTGGTCTCGGT 
CAGGGAGGACTGGTCGTTGAGGTACATGGGAGAAGGGTGGAGAAGTTTCCTCT 
. CTATTAGGCGATTACGCAGAGGACTGCAATGGTGAGATTGTGGCAGGTGGTTT 
CCTAGTGAGGCTGGTTCAACTGGTGCTGGGGACAGGGCCTTAACGGCACAAC
AAACATTGGTGACACACTCTGCCTGGGCTCTAGTGACATCTTCGAGGGCGAG • • 
GCACCCCTGGGCAGCATTGCCTTCCATTCACTCTGGGGCTGTGTTTGGGAGAG 
CTCAATAATTATCCCAGAA.GTGAGCAGGATGCAGTTTCrCAGGTTTGCTCAGA 
AGCGACAAACTGGGGAAATTCCCCAGACCAAATTCACTTCn'CAGTCATTTm 
TTTTTTACTTCTCCTTATAGTGAGAAGAAATAAAATATAATTTCAGACTAT 
CTTCTTTCGAGAACTCAAAAGGCTGGTAACACTGGGCCCATGCTCCGTGTGGC 
CACAGGAACTCAAGTTTATAGCTATTGCTCTeACTCTCTATTAAACTTCTATA . 
TTGAAGTCTTATCTTTGTGiTTTCATGGTrACTGTTTTTTTGGTAGAGAA^ •• ' 
TTATCTGCATTTATGTCTTTATCAAAAAATGTGAAAGGCAAAATGGACAG(3G 
GTAAAGGGATCTCTTGTTTC.CAGGGAGATGTAAGAAAACATATATCTTAGTG 
GGTGGGAGGAATATTCCAACATGTTAATATGTGAACACCTGGCCACTGTCAC 
.TTATGTATGTTACCTGATGGAACCTGGAGGCATCTGAGTCTGTGAGCTGTGAT 
CTATAGCTAACCAATTCCCAAACTTGGAAAGGGTTTGAAGAATCTGGCAGGA " 
GGACTTCAGTCCCCACAGATGAATGCTGCTCTCATAACCACTGACCCATGCCT 
CTTCGGGGCTACAACATTAA.CTCAAGCAGAAACAGATTTeCCTCATGCTTGCA 
GTGGTAGAGGCTAAGTTAGCAGGGGTTCCAGCTAAAAATGTCAAAAAGCAA 
AAGTTCATTGAATGCTGAACACATGCCAGGCACTGGGCTGCATTGCATGTTTT 
ACATACATTATCTGACTCACATTCAGAACAATCCTTATGAGGCAAGTACTATe 
AAGAGCCCCATTTTGTGGACAGAGACTCAGAGTAAGTTAAGCTACTAGGAAT 
GTGAAGCAGCGTAGACATATGCCTAGCAGCATACTGGCTGCTTCTGAAGGCT 
CTGATGAGTCTGTTATAACCCCAAGCAATGCTTACGCCAGTGCAGAGAGAAC 
AAGGGAAGCAATATAGGACACGTGAAGTTGTGGGCACACTGCAGGCATCCA • 
CCTCGAGGGTGCCAGGCAGTTGGGAACGGGGAAGTTAGGCAACATGCTAAG 
ACCCTCCCAATTGGTCATGGCAAAGACTTTCCTGTGGAGTATATTGTTTATTG 
TTTTATGCTGAGAGGTACAGAATCTATACAGAATATGdTGTACAGTGTTGAGT 
GATTGATACCCTGGCATTGTTGCTCAAGTATGAAATCTCTCCCTGTTGACATT 
TCirrAGTGAAAATTAGCTGTTCACAAGCATTITIT^ 
CAAAATGTAGTCCTGATAGGTGTTTAGCAACTTCCAGACCCAAGTATAATGC 
AGGAATCCAATTTAATCCACCAATTCTrCAAGATCCTTCTGGAGCCAACTAAA' 
TCTGTGTTTTAAATTGAGTTGCTTAGACCiTATGTTTCTITGTTATTTATATATT 
TATTCATATCAACAGAAGCAATTCAAAGTTCTGAGGACCACGAGAGATGTTG 
ATtAATTTAGGCACCTATCATTACTCTGTAATTATGTCTAGAAGTCTTACAAG 
TATTCCCTAAGGCTGCTGCACAGTGACTTGGGCATCTGGTCCAGTCTCCGGGT 
ATTTCTTCAGCAGCCTAGTAATTGATGGCTGAGATCGTAATAGACTTCTTGGT 

tctgtcttgagttggtccttgggotaaatcacaggtttccctggatcagtt^ 

ATTAACTTACTTGCITCAGTTTGGTTCAAmATTTGGGTCTCCAGCAATITGG 
ATTCATGTACCTAGAAAAAA.CAAAGACTTCTCTTCTATTGATGTCCATTCACA 
TGTTTAGTGATTCATAGAGATTTACAAAACAAACAAACCAAAAATAAACTTC 

•.TAAAGAAATATAATAATACTGCCTTGTTCCCAGACTTAATAAACnCTGeAGT 
CTTCTAAAAAGGCTTCCAAAATATCCCTGAATGATTTTGACACTTGTAATTTA 

. TTGAGATTCAATTTTCTCCAGGCCCATf GATTTGGATACCAATAGGTTCTTAA 
TATCITAClTlTrGAGTITACATTTCCCCATCCATTTATGTCACTGAATAGTCT 
CCAAACCCTGCAAAATGAGAGGGAAACAAATTATTATTATGCCCCTCATTTT 

. AAAGAGAAAGAACCTGACATTTGGAGAATGATTGCCACAAGCTGATGGAGTC 
AGGCGTGGGACTGGGTCAAGGGCTGGAACTCTTCCTCCAGTGTACTACAAGG 
CACTGTGTATTTTTGTCACCGATCAGTGATCTGGGAGTGAGGTCTTACTCTOT

GTTAGTTTCACCCATTCACTTATTGAGTCtTAAATATGTAGTCAGATTATTTAC 

TTATATATGCAGAAAAATGTGCACAGATmGGCATCAAAGGTTTTAAATGTG 

TACTGTGGCCCTGGCAAGTTTGCAAGGTTrTCAACTTGATGTTCTATCTGAGG- 

CCCAGTGTTClrrGTCTCtAACATGGGTTTATACACAGTACCrGGCT 

TTGTTGGGAGGACCATGCACACAGCAAGTACTCTGTATGTGTTAGCCATGATC 

ACGGGGCAAAAA AATCA GATGTCTTOCCTCTGATTATTACCTAATGCCAGTT 

mCCTTCCTCTTCTTTTCTACTCTATGAAGATTGTTGTCAG^ 

AGTAATTATATTCAAGTTAAAGAAAATCTTAGCrGCCAGCTCrGAAXTTCTA^ ■ 

GATCCrAACTCTGTAATATGTATATTATAATACATTATATAACATAGrAGACA 

AGCAGACTATGTTATAAGACATAGCAGATTTCAAGTAGAATAGGAAAAATTG 

GCATGGTCATTGGCACTGAACAGAAACTGACiTGGGTGATTTAAGm 

AAACAATTGTTAGAAGGACATGGGGGTAGCTCATAGAACAGAGGGACAGGC 

TGAGTAAGCAAGCCITG GAAAGG ACAGAAACCGGGGCAGCCtT^ 

CTCTGTCAGGATCAGTCATmTTGTCTITGTATCCTGCCTTri^ 

ACCCAATGGAAGGAAAAGCTGATTGGCTGAGCTGGGATCATGTGATTTCCTG 

ttggtcagaagaggcagggcaattggctgactattccatcacaatagtctgc' 
AgtcagaaagggttgatgccccaggtgagccAagaaaaatgtatggagtag 
ataccaagagaaagagaacaacagattgctcgerctaaaaatacagaaagt 

AATTTCATTTAtATGACATTCTGGAAAAGGCCAAACTAGAGAGGGAAAGCAG. 

atcagtgattgctagggtttggcgggggagaagagGgcttgatcatgaaggg 

GAAGCCCGAGGGAGTTTTCTGGGTGATGGAACTGTCTTGTGTCCTGATTGTGA 

caggggttacatgaatcagtgtgtgttaaaacccattgaactgtactctAaA 

ACAAAAGAGTCAATTTTGCTGTGTATAAATAAAAATAACACTAAAATAAAAA 

TACAGAATACAAATCAGTTATGAAGTTGCTITACA'mCTAAATTTAAATTTT 

CTTCTTGAGCTGCTGATTTTAAAAAAGGCATCCAGAGGATTCGCATAATTm 

TTTTTTGCAACAGTATTTAGAACTTCGATTTAACAAATGTGTTCGTCT 

AACAGTCCTCTTCA TTGTTC AAACAGTAAGTAGCTCTCCATTTATTTTATCTO 

TGTCAGCTAAAATGTTrTTAACCATGGCATCTGGATTAAGCTTACCTGGGAAT 

CACATAAAAGACAAAAAAGAGATGTTGAAAAAATGAGGGAACAAAAAAGG 

AAAAACACTTGTCACTGTACAGCATCAACCCTTAAGATCATCAAACAGTGTTT 

cagtaaatgcttractrcctgggatttaagtgaatgttaaatattatactgat 

aaagcaacagtcggaaaatatgcireccmatctgacctcccttcacctcca 

ccaaggtgacaggaagggcattactacatgacaaagatatttgttgctgaca 

gtgggaattctaacaaaaggaaaacaagtagcatgttcacagtatttctgta 

acatattaataggtatgaaaaaattaacttccataggaagacagtaggaaat 

aattttcjitrgaaatcgtacntraaaagtcagtgctcttt^ 

acaAgaaccaactaaagcagtccttagagtgtaaaacaacagaatcataaa 

ctctggaggactctttaaagccaaatactcaatccaGtaattcaagaacatg 

cgcttatagatctactgatIxjacacaaagggaaagcaaggatttgccaagtg 

• gtcagtgacagagaatgctttccactgttcaccgtgcttctggaagattgtaa 

• tgatcattgtcatgactatttatatacacattttcctcttgtcagttaagcact 
tragggctgggtttgctaatggaggttgtggaagagatttgcatrcttgtctc 

TAATTGCAATTCCACTTCTCCAATCAAAAGCTACCTAAGGGCCAGCCGCGGT 
GGCTCATGCCTGTAATCCCAGCACTTTGGGAGGCCAAGGCAAGTGGATCACC 
TGAGGTCAGGAGTTCAAGACCAGCCTGGCCAACATGGTGAAACCCCATCTCT 
acaaaaatacaaaaattagccaggcatgagggcaggtgcctgtaatccfcagc
tactggggaggctaaggtgggagaatcacttgaacccaggaggcagaAgtt
gtagtgagccgagatcatgccattgcaccccagcctgggtgacagggcaaga . 
ctctgtttcaaaaaaaaaaaggaaaaaaagctgccttagga • • 

GAGACAGAGTGCCTTTGTAAATrATGTAACTTGACrCCATTTTATATCm • 

AAATTATATAACTTAAATTTTATCAGTCCrrAACAACTGCAGTGTAAAAGGAA 

GGAATCCTTTGGTGTCTCTTAGAGACTTGAGCCTGGTAGCTTGCATTCACCAA. 

CTGTTCAGAACCTCAtTGGATCTTTGTTAGAGATGCCAACAGAAATCAGAAG' 

TAGGGATAAIjTGTTAGGAAGGTGGCCTGTGGTCATGITm • • 

TGGACAGAATAATGACTGTGGAAAGTTAGTTCATTTTTGCAAAAAGAGGGGA ' 

GCTTTACCACCTCCCATTTGAAGGACTTCATAGCTCTAGTGATGTAATATAAT 

CAAACATTCAAAGGtACTGAATAGATlTITATrTXCATAATATGCTTTTAT^ . 

AATAATCATGGAATTTGGTTTTATGGGATATATTTGAAAGATCAAGTGCAATC ' ' 

AAAATTACATTTTGAGAAAAAGACCGTATTTATCTTACTCACTACTGTAGCCC 

AGATCAATAAATAGTTAAGTdTATGAATGAAAATAATGAATAATATTAAAAA ■ 

GATTGAATGTGGTTTTCATTGGCmCAGGATTTf ' 

TITGTTTAATTTATGAACAGTAAAAGTtrAGAAATAGGCrTTCCAA^ 

CTATTTTCI TGATTA ATTATGCAGGGATTAACAGITrGAAGACATAATTGAG 

GGTCATCCTCTTTTATATTATTATTATTTATTTTTTT 

ggaaAttttatttcaagcttgagttggtggatcagtgaccatttgcactaagc 
accatataaaagtccgtatttttacataagccggtcacaaaaaaatatttgta 
acttatgaccggtcataccgtaaacagaagagteaactttacttaaatatttt- 
gcaagttacaaacaaatmattaggtgttttgaaactgttgtmaagtc 

AATTGAAGTTATA<jGAAAACAATCAATATTTTATAACTCAGACGTAATTCAT 

gaattttataattcatataattcarnggcrttctttctcccccctagattctg 

tatactggaattgttatttatgcccctgccctggctttgaatcaaggtacat^ 

ttagagttgccagttaggtaactcacattttggggttcactttcaacaagcct 

tattttctccttggggagatggggagatggaggaatgcttctagtaacctgc 

atcagctttacttagcggggcaggatgggttcagtgtctacactagcttcttt 

gggtttttggatggacagctccaaagtgtctgtagaccacagaggtggactt 

.crccagg?tggtacttctctgggatgtgcccatcagcccattcactcctttaga 

attaaagctccctgtagccaaagtcaggattgacggcatcccctcttgtgaat 

•ctataacctggagcccatctccaggaagcttccctgtaattctgctcaggcct. 

gttgtaaaatctggcagggaisaaaagcttttcttctccacacttctaataagg. 

cccaaaatgaaagagaaagagagccatgtgatttgaatgatcagactgctcc . 

tagtgacaaaaggacagatgtctgtagtgcccttacaaataaatttaggaag. 

attgtgctgctcaacaaagtatctatactctctagatttgggaagataaatgc 

agtgaggctgggatagtttattgaagcaaacgttcatgctagtcatagtttca 

aaaggctitggggaaacaccatgccctttgaattcttatctattcgaagtgaa 

mctctaaaacgtccttgtaaaatggacatgtggactctgtgatgggaaga 

atgtggfacaattcctggggttaaaatgacatgaagaaaacctactaattcc 

acactctgttttcttgattttatgatagacatgacagtagttaccagctgtttc 

tgaagtgaacaatattattaccaagaggaacttcatgtgtaaggtgctcttg 

aactctgaattctgggcatgttccacatcggtattaccaacatcacaagtgga 

tcatctcatttgtcgagaactgagttataaacttcatgagttacctatttagg 

acttaagtgtaattgaacatattatggttttaAaatgcagttctgggaattac 
CAGAGGACTGACTTTMTCTGTGAAGAAATCAAGCATTCTGTTOT
TCAAACCACTCCCTGAGAGTAAATTAGAATGAAAGTGAAGGTGTTTTGATGA 
CTAGAGTGTCAGTTGTGTGTnTACTGTGACCAAAGCTGTAGGTACAGGCATA 
GACACAGAACTTAGTGTGCAATAGATGCTCAAAAAAGCTTATTTAACTGAAT . 
TGAAAATTAACATTCTCTAGGTGTAACCTGTTTTTTTCTT^^ 
AATCCTCATACAAATACTAAGAGAGAAtAGAGCCCCTTAATGTGCTCATTCA 
■TCACTCTATGTTAAT ATTTA TCAACTAATGGTCCATTTTAATTCACTCCCCCTT 
ACCTCTAATAGATTATTTTGCTGCAAATATGTTATGTCATITeATCTGTATTTC . 
AGTATGTCCTTTAGAAATAAAGACTTrAAAAACCATAATCATACCATCATTGT' 
ACCTAAAATAGTAATATTAATTTCTrAATTTCATATCAAGTCAGTGTTTACAT 
TTCCCTGATTTTITrTCAGTGTITATAAGAATCAGGATCCAAATAAGCTTA^ 
GATTGCAATGAGTTGATGTCTCTTAAGTTTCCCTTTTCTGCirmAA^ 
AAACCAGTTITTCTTrCTTTCTTTTrTCCTGACCGCCT^ ' 
ACTAGTTTTTATTGAGGTATAGTTAACATACAATAAAATGCACAGAGCATGG 
GGAAGTGTACAGTTGGATGAGTTTTAGTAGTTGCCTAATAGCTTGTGTGACTA 
CTACCCCGCTCAAGATATAGACTATTTCCGTCATCCACTCTCAGAAATTTTCC 
TTGCATGTCTITCTAACAAATCTCCTATGGGCCCAAGAAATCATTTTCTGAGT 
TATGCrACCATAAACTAGTTTTCCCTGTTGTTAGATTTTATATAAGTGGAATC 
/ATATAGAGCCTTTTCATGTCrGGTTTCTTTTGCTCAGCAAAATGTTTTGAGATT 
CATTCATGCTGCTACATATATATCAGTAGTTCATTCCC l ll' i i'ill l l'r i 'l l 'l'ri' 
.TTTTTTTTTTTTTTTTT^ 

GCAACATTTTTTTATCCATrCTCTTGTTATAGACTTGGGTTGTTTCCAGm 
GGCTATAGTGAAGTAAGGGTGCCAAGAACATTCTTCTAAAGGTTTTGTGCATT 
TTGTTTlTCTnTrC ACACCTGTTTTCAOTCC^ 

TGTAGTATAAAATTAAAAATATMTTCTATTTGGfACCCAGCTGAGGATAGAA 

ATGTGTGCAAATTTACAAGAAATTGCCAAACAGTTTTTCCAAGTGGCTATATC. 

GTTTTCCATGTCCACTACGGAATATGTGAGAATTCTAGTCATTCCACATGCTT 

GCCAGGATTTGATATTATTCTTTTTTATITCAGTCATTCTAGTG^ 

ATTATTTTATTGTTGTITCAGTTTrCACrTTCrATATGACrA^ 

TATTATTAGAGTATTATGGCTTAGGTTATTGGCTTCTGAAATAATAAAGTGAA 

TAAAGAGAAATAACCTAATAAAAAGTGAGGGAAAGAGTTGAACACTTCACAA 

AAGCAGATATTCTTCtAATACATTACATGATATATCTATTCATTTACCTAGGC 

GTTTAAAAATTtCTCTTAGTGGTCAGTTTGCkJAGCATGtAtAGTAAAACT 

AGAATAGAGAGAATATTAGCATGGGGGTTGGACAAAAATGATATGGAAATTT 

GTGAAGGATTCCATATtTTTTAAAAAAGGAAGAAAAGAAATAATAAAATAAA 

AAGTGtCTTAGTAATGTTTTGTAGCTTTGGATGITGAGGTCTTTTACrAT^ 

TAGATrTATTGCTAGATAATTGTTGTTTTTTATGCTATTGCAAATGTCATTGT^ 

TrATTTTTCAATTATTCGTTGCAAGTGTATAAAAATACAACTGAGTTTTATATG 

TTGATATTCCATCCTGAGAACTTGCrGAATTTGCTTTTTAAATCAAGCAGATT 

TTTGG TGATGCClTAGGATTTTCATGTTGTTrATGAATAGTGAGAATTrTAGTT 

Cll"iTGTAATGCTTATAGCTGTTAGTTCrrCTTTGCTGTCTTATTGTAATGTTAAT 

GtGGAAAGTTACATTGACTGAATTTTGAATATTAAACCAATTTTT^ 

GGGATAAAGCGTQCrrGCrCATAATGTATTATCCTTrTTATATATTACCGAATT- 

GGATTrGCTAACATTTTGATAAGGGTTGATAAATGAATTGAAAAGTGTTCCCT 

GGTGGATTTTCTGAAATAGTITATATAAAATTGTGTATTTATrTCTTACATGCT 

TGGTAGAAATCACCCTGAAGGCATCTGGGCCTACTCTTTGTTTTTGTAGAAAG 
ATTTTGGTTTATGGATTCAGTTTATTTCGTAAATATAGAGTrATTCA^

AAAGTCTTGTTGGGTCAGTTrTGGTAAATTGTATTTTTTAGGGAATGOTCCA 

TTTGGTCTAAGTTATCAAATTCATTAGCATAAAACTGCATATAATAGCCTCTT 

GTTATGCTTTTAATGTTTATAGTATCTATAGTGACATCCCTTCTAATTATTCCT 

AATATCGATAAGTTTTCTATCTTCTTATOTGATATATCTTTCCAGTATTCTAT • 

TGATGTTTrAAATCTCTTCTAAGAATCAGTGTmGGCrCATTGATm 

TirrCTATTTCATTAATTTGTGCrATTTATTGTTTC^^ • 

GGTTTTCXAGCTTCTTAAGGTAGGCGCTTAGGTCATTGATTTTAAATATTTOT 
•ATTTCATAAAGCAAACATrTAAAGTTATTAATTTTTTTGAAACA ' 
CTGCAGCCCATATATTITGATACCTTGTGTTTTTATTTAGTTCTCAA^ 
. TCTAACTTCfCCTGTGATTrCTTTTGCCCATGGGTTATTTATAAGTGTAiTGCT •' 
TAATTTTCAATATTTGAGGATTTTCTATCATGTOTiTrGTTAm ' ■ 

ATTMTACTGCTTTGGTCAGAGAACTAATATGATTTTAATCATTfGAAATTTA " ' 
TGGATTTGTTTTATGGCCCAGGATCAGGTCTGTCTTGAACATTCCATATGCTC ' 
TCAAAAAGAATGTATATTGTGTGCTTlTrGGATGTAATGTTCTATAAGGTCAA 
AGGTTTAGTTGGCTAATAGJGTTATCACATCTITGATATCTn 
. TTTACTTGTTCTACCAATTGCTGAGAGAAGGATGCTAAAATCTCCAGATATAA 
TrATCGATTTTrCTATTGCTGCrATTrCTCTGTCAATTm 
GAGGCTCllTrAAAGTCTCITITrAAATT^ 

ATTTAAATTGCAAGATATTTGCAGAAGAAACTGGGTtAlrrGTTCTGTC^ 
TTTCCAGATTGTAGGTTTAACAATTACATCTGTATGGGGTCATTTAATGTrCA 
CCATTCCTCTGTATTTCTTGTAAACTGGTAGTAGTTTACTAGTAGTAGTTAGAT 
CTAGACrCTTTTTiCAGACTCCACnTCAACTTmGGAAATCATAm 
AGTATCTATAAGTTTATITCCTrCAGGAGTCCCAGAATGTCAlTTGAAAAATT 
ATCAGCAGCTATTGATGATCATTCTAGATTCATTATTTCATTGAGGCTTATAA 
AATGGAGTATTCTAGTTCTGTGAATTGTTCTTCATTAGTTAGATGGAATACTG 
CTATAAAGAGAAACTTCCrCTrATCAACTATTTGCTTTCTCTGAAAAATAGTT • 
TGTATAGGAAAGGGAGGATAAATGTCAGATTTTTCCCCTTATGAATTTTAAAA 
ATAATAAATTGGTTTCCTACTATCCCCCAAGGGTGACTGAGATTTTTAAAAAA 
TTCTrACGAACTeATGGATTTAAAGGTITGCTGTTATTTCAT^ 
• AGTATTATTATTGAtGCTCAAATTGtCCCATTTITATCCAGCCGGAGCCCCCrr 
AAGTTCCTCCTGATTTTAGCCTTGGCTGCATATTGGACTCACTTGCGGGGTGG 
GGGAGCTTTCAAAAATACCAATACCTCAGTGCCCATCACCAGAACTCTGAGT 
TAATTAGGCTGTTGAGTGGCCCAGTTATCAGGATTCCATAAATTTCCTATGTT 
GATTTTAAGGTGCAGTCACAATTGAGAGCCATGGATGTAATAGTTTTAGAAT 
AGCAACTCTAAGCCATAATGAATAATATAATGGCCTAACACAGTTAAAAATA 
TTATTCTTtGITGTTCTrTTTrCTrGGGTGTATCrCAGTAGGAATGTATGTr^ 
ATTACTGTGTTAAATATTTTGTGACATGTTTCCTCTGTGTGGTAATGTCACAA: 
ACITGATATACAGTTAGGTrCCATTTATTTCATmGCAATTGAtlTTGAGGAG 
TlTmill'CTAATTTAACmATATTTATGTGGAATATTrATGTGTrCCAAAG • 
TGAAATCrATACATCAAAATATGTTTAAAGTAGCGTGACTTGTATCTCTGTCT 
TCrCrAGCCTGTTTTCTCCCrCTCCTAGGAGTAATTTTTTGGCT^ 
TTtAATmAGTATATGTACrGCTGAAACAAGCACAATTCTTITCTT^ 
AAATGAAATAAAGTCCCCACTTCTTAGATAAATAGGAGCAAATTATAAAGAC 
TTTTCTCCATCrAGAtCCCATTTTGGTAGCACATATCTTAGTAACGCCTTCTTC' 
AAGGACTGGTGAAATTGCGTGTTTACATGGACACACAGAGGAGAACAACAC , 
•ACAGTGGGGCCTATTGOAGGGTGGAAGGTGGGAGGAGGGAGAGG^^

AAAACAACAAATGGGTTAATGGGTACTAGGCTTGATACCTGGGTGATGAAAT 

AATCTGTACAAGGAACCCCCATGACACAAGTTCACCTATGTAACAAACCTGC 

AGATGTACATCTGAACTTAAAAGTTTTTTAAAAAGCATTACCGGCCAGGCGC 

GGTGGCTCACGCTTGTAATCCCAGCACTTTGGGAGGCTAAGGCAGGTC^^ ' ' 

ACGAAGTCAGGCGATTGAGAGCATCCTGTCCAACATGGTGAAACCCGTCTCr- 

ACTAAAAGTACAAAAAATTAGCTGGGCATGGTGCTGCGTGCCTGTAGTCCCA 

GCTACrCGAGAGGCTGAGTCAGGTGAATTGCTTGAACCCAGGAGGTGGAGAT' 

TGGAGTGAGGTGAGATGAGAGGAGTGGAGTCGAGGGTGGTAACAAAGGAAGA • 

OTCGGGAAAAAAAAAAAAAAAAAAAAAGGAATTAGGATTACCrTTATTAAA 

TITCTGAAATCA<}ATCCGCAATCTGCAATGlTATAAAGATrACATTTGAATGC 

TrrGGTGTTGTGTTGATTTGAATGTATACATGATAGCAATTTGATATTGTCATT 

. atattgtgagtatgatagattAtcgtgaaattgaggtatttgcaagattacat • 
aaaaatagagtgtttatttgaagcgggtctaatggggagagagataagaaat 

ACCCATC^C^GTATCAAGGAGTrrCTTAGCTTAATTACAATATAAACGCCtTT • 

CGATGTATTGAATTAAGGGAGTGTCAGATGTGTGTGTCTGAGAGTAGAAAATT 

ATCAAATACAATTTTAAACTGCATTTGTTTTTGAGACATTTATGAGATTTGGA 

GCTAGliTTAGCTTTAGGCAAGTGGGTATAGAGGAAGGTGGGTGATAAATGA 

TATAGCTTCGTTCTCTGATGATTTTTGAGGATTATTTTTATTGTGCATAm 

CGGGGTAGTGGCAAGGGGGGTGGTGTGGAGATTCTACTGCA^TG^^ 
CCAGGAGATATTTCGGTTTTGACTCTACCCACTTGCTTTGCAAAATTGAAAAT 
TGCAGTTGTTGTATAGGGGAATCTTGTTTGTGGAGAGTTAGTCTCGTATTGCCA 
AGCTCGTGCAGTGTAAGriTTmGTGTGAGAAATAATCTGTAGTATAATTTG 

.ATCCrTrGTAGAAAATGGAGCATAACTGAAATTTTTrCITITATGTGATCGA^ 
GGAATGGTTCGTAATCTTGTGTGTTGATTCATGTGTAGTGTTGCGAAGTrATTA 
TATTTGTGCTGATACTATTGCTCAGCTTAAAAAGGTATCATAGGTCATAGGCA 
CAGTGGGTGCGACGTGTAATACCAGCAA'nTGGGAGGCGGAGGTGGAAGGAT 

■ CAGTGGAGCGGAGGAATTTGAGACCAGCCTGGGCAACAAAGTGAGACTCTGT 
CTCTAGAAAAAAACAAAAAAAATTtnTAAAATTAGGAGGCATGGTGGCATA 
CAGCTGTAGTGCGAGCTACTCAGGAGGCTGAGGTGGGAGGATCGCGTGAGCC 
CAGGAATTTGAGGCtGCAGTGAGCTGTGATTGCACCACCGCACTCCCGCTTG 

' . GGGAGGAAAGAAAACGGTGTGTGAAAACAAAAAGAAATGTGTGATAACTTGC 
CCTTGCTTATAGGTAGGGGTGAGAAAATATGGAAAATCAGTATGGGATGGCA 
CTGATATTATTATGTGTGCTGGTCGGAGGCAGTGAGAAAATGTTATGGAATGC 
TTGACAGTCTATGTCTTAACGCTTTAGGTGCTCGATCTTGTAGAAGATGGGGC 
CCTGAGGCAGGTGGTGGTTCTGCTTAGGTCATAGAGAAAGTTGGGAAATTTGT 
TGTGGTGGCTAGTAGAGATCACTTATAGTATATAACAGTGGAAATGACGATC 
TGAGTGAAAACCAATTAAATATCACCTCTGCCTGTAGGATAAAATTGATATTA 
TITACrAGTAATTTGAGTCGKlATTTTrCrTCCGAACAATGTCC 

. CACITIATTCTGTGGAATGTATCT GGCCC CATTGAGGATGCCATGTAGTTTCA 
CTCCTtrATGTTmGCCCATGCTGTTTTCTGTGGTGGGGGGAACCTCGATCTT 

.TGGGTCTGCrrATGTTTGAAGACTTGATTTAAATATATCCTTCTGTATGAAGAT 
TGTGGGAAGCTGAGTGTTTCCCTTGAGTGCAGTTCTTAGGGAACATTGATGTTT 
ATATGATTTTTCTGGAGTATAGTGTATAAGTGTGTAGATGCTTAATAATGTCT 
AATAGAAGCATTAATCCTAATATACTTTCCCCTCAAAGGGTGGTCTTAAAGCA 



WO 03/104427



PCT/US03/18239 



GTTATCtGGACAGATGlTTlTCAAATTGGGATCATGGTGGCTGGATTT • 

•CGTGATTATACAGGCTGTGGTGATGCAAGGTGGAATCAGCAGTATTTTAAAT 

GATGCCTATGATGGTGGAAGATTAAATTTCTGGAAGTAAGTGTCTAGTACTTG 

GGTAACTGAACACAtCTTTrGTAXTCTATAAAAATAATCTCtTTATTGAAATA 

:GTAGATrTACATTAAAAAACAAGCCAACAAATTGCTAAGGATGTGGTAGAGC 

AAATTGAAGCAGAGAAGTAAATACGTAAGGAGCCTCCCTCTGTTCTTTAAGG 

GATTAAACCTGTCAGATGGTACTTAGCTACATGGTGCTTAGAGCAATGTTtCC 

TTCTGAGAAGGGACTTAAAGCAAAAAAAGTATTTTCTTCCAGGTAtTAAAGT • 

CGAGAATAGGTTGAAAAGTGGGACAGGGTGATAAGGAAAGAGACAGTGGAA 

AGTTAAGAAAAGGCAGCTTCTGGCCAGGCACAGTGGCTCACACCTGTAATCC 

CAGCACTTTGGGAGGCCAAGGTGGGTGGATCACCTGAGGCCAGGAGTTCGAG " 

ACCAGCCTGGCCAACATGGCGAAACCCCATGTCTACTAAAAATACAAAAAAT 

TAGCCAGGTGTGGTGGCAGGCACCTGTAATCCCAGTTGCTtGGGAAGCTGAG 

GCAGGATAATTGCTTGAACCCAGGAGGGAGAGGTTGCAGTGAGCCGAGATCA 

cgccactgcacttcagcgtgtgcaacagagtgagactctgtctcaaaaaaaa 

aaaaaaaagagaaaaggtagcctcttaagagacaaacactagcatattgga. 

ttagcacctgccataaaaaaaaaaaagagacagagacagacactgaaagac 

agtgatttcattttgacaattctttgttttaagcaacmgagagtttct 

tgatattgccctggcaatctatagtatattaacatagtagcttatgattatgt 

attataattctatgtatgtgtatgtattatataatgaacattatctaatatga 

agtattatttctagagaattgactaggaaagttacagtctatgcttcaaatga 

cagtacagcacatgatgtaacatgttttaagcagttattcagttttcctaaag 

aaagaacaatgaagagactagatatttcatcccaagtatatcacacaattca. 

AAAGAATATTGAAAATGCCTTCCGTTTTGGATCAATAGTGTTGTCCTTTTGCA 
ATATGGAAAGGGACAACCCATGTTGTCTTGAACTAGCCTATCTTCTCTTTGAG 
ATCGCAGCCCCGGTTrAACATAGGCAGTTrGAAGAAAAAAAACCCATTTTGC 
ACTTGGTGGCTCTTTTCTGGTCrrCTGAAAATAAGCACCAAAGTTTGAGAAAA 
AGGrrTTTCAGAAATTGGACAGGGTCAATGTGTrAATTTACAGGGATAAATTT 
TAGTGAATCAACTTTGACTATTTTCAATAtTTCTTTCCTTCTTT^ • 
TrCAGATn:CTAAGGAAAAGAGGCTCTAGTTGCCTAGCTTGGACCTTGGTTCCA 
CCCCTTGGCCAGGGCAGAACAAGATATTTTGACTGATCGTCAATTGAGACTA 
■TITAATGGAGAAATGGTAGTITCCCCAAAGCAAAC(^GGGTT 
AACAGGGAGAGGGGAAGTCCAGGAAGTGGAAACAACCAATGACTACTCTTG 
AtTGCTCCCAAATCTCTTCCTGGTAGGCCAGGCTGAGGAGAGAGGGTATGGA 
. ACCAATCATITITCTGCAAGATAGCTGTCACTTTATATGAAGGATACATATAT 
. TGGAGCACAGAAGTGATAGCTTATACACATTAGTAAGAGATTTTTAAAAAAA 
(3ATrrAAACATTTTTATGACCTTAGTTTTGAAAGTCATTAAGTAACAATAA>^ 
GCCTTATTTGTGTTTCATACTTTTCAAGAGTATCCOTGTTATTC^ 
TTGAGAATATGATAGACAAAGTATTGGTCAAGTrAAGATAAATGAAGAGAAG 
AAGGATATTGTATTGGTATTTTCTGTGTCGTCTGCCATTGGCCTATGTATTCTG 
TATGGCCTAAACCACTAAAGTGAGTGTTTTAAATCTTGTCATACTTCCCTCAG 
ATAGCTTACTGACTCCnTnGGCrCTTCAGTTAGCTAATATAGTAGCTTCTCTC 
TCTTGGGGAGAAAGGGTCAACAGTCTTGAAACTCTTACACTTTGATATGAAA 
TGTTAGACATGAAATAGAGGCTCTCACATTTCCACAAGAATGCCAGAATACA 
. " CATACAATTAGAGCACTTAACTGATCTCAAATATAAGATTTGAAATGAATTTG 
CAAAATTTCATAATTTTAAGGAAGTCTCCATGAAAGCTAACTTTGCAGAAAG 
ttttctagctcatattgtatatcagagatgaacaaaatcattCcttcc^^ •

aaaagattaaaagatctttcaaacatggaagttgagctttccctgtgaeaat 

gttttggacttatgtacatgatgtcataaagtggctttaaactattagtam 

agcttgcacgcacagcattttaataaagcatgattacaatgacatgcagtm 

tagaaaatgcaaaacttgaaactgcattaattgactgatrrgttgaacgtga 

atatagattagaactatgattacatgtgctgggagaggaattctata'cacaa • " 

aatgtgttgatttcgtgttcatitragtgccatgtcgtctgctgtggtgcactr 

AGGAATGTtmCrCTtrCTCTTTCTCTATGTITGATGTTOT 

CATGTCAGATTTGtlTAAATGACAGTTTCCTTAGATCACAGGAGAACAAGTTA. 

TATAAATGCTCAGAAGGATTGCAACATtCATTTCCCCCAGTCTCTTCTAGTAT- 

TITCAGATCTGTGAAAATAGTATTTmATTTCATTGAACTTAAGTO 

ATTTCrACGTTTrAAATTAATrAATAAATTTTTACATATTTA^^ 

GTATATAAATGTGCATTGCTTGCTTCCAAATATAGTCTAAGAGAATTIT^ 

AGTCTCTGCTAATAGrrCAGATTTGTTTTCTGtlTCTTGTCATTTAGTm " 

CCTAACCCTiTGCAAAGACACACCTTCTGGACAATTATTATAGGAGGGACCTT. 

CACATGGACCAGCATCTACGGTGTCAACCAAtCCCAGGTGCAGAGATATATT 

TCTTGTAAAAGCAGATTCCAGGCAAAACTGTAAGTCACACACCATGGTATAT 

GAATCATTAATAGCCATCAGTTGTCTTTATGGAAACTCTTTCATAAGTCACGT • 

TrAGCTCCTTTATGTCTTTTGTGTCATGAAATTCCAAGAGATAAATGATATAT • 

TTGGTTACAAAAGGACCAAGAACAAATTGTTACTGTATGTTTTAAATCAGGTA 

TTTGAAAATATITATAGGAATTATTAATGAAAACAAATCAGCATTTATTGTGC 

ATCTGTTTGCATAAGAACCCATGGGGGATAAATATATATGGCTTTTGTCTGAA 

GAAAATTAAGCTCAAATTAGGAAAATACATGCAAGTATGTGAAAAAATAAA 

ATAATACTGGAGATATGTAACAGTTTCATGAAATGATATAGCAATAGAGAAA 

ATGTTCCAGAGTACATATAGtGCmATATTTCAItTTCAGACACAGTCATGG 

ATTTATAACCCTGCTCTAATATTTGCTATTTGAGTGATCCTGGGCACATTCTGC 

AGTCrCTTTGAGCrGCAGTTTCTAAACITGTAAAAGGAGCATAAGAAATACT 

ATACACCTCATGGGAGGTGTCG TAAAAT ACTGTGTTTGGCATACGCATGGTA 

AAGGCCTAGTATATGTAAGTTCCrmTCCACTTCAACTGATGTGATGTGAAG 

GTGGAGGATGGATGAGAGATTCCTATTGAACTGGCAGAATGAATATGAAGAA 

TGATCATAATTTGGGCAGGTGGAGAAGAGTAGGAAGGCATTCTAGGTAGGTG 

GAGTGATTTGGATGAACGCACAGAGGTAGGAAGGACAGCATGGTCCAGAAC 

TCTCAAGCCCATCITGGCTTGAGCAAAGAGTGTGGAATGGTGGCTGGTTCCTT 

• ctctggagqcctcacatcactgcagtgctcaccactccctgtgctgataaata. 
ggcrgactctgcrgacatgcrtttaggggctttaaagctctaaggagattgaa 
taagagtccagggtggatagggtgagggaaggaggacaattctAtttctatt 
taaaggagaaaatgaaaatattgtgatgtcatatgtcagaacfcaatatatt 
gagagtaaattggtttagagattacctaaatctttgaagaactcacagtgaa 
aaatctatatgaatatggatAtaataaccagatatgggatccccgcaaaatg 

AACITAGGAATTCTGGT ATtr TACGGAATATTGGAGGAATGTAATTTATAGAT 
AGAGTAGTCTATTGTACnTTAAATCTATACTAGAGCTGATACCTCTCCCAGA 

• GTAATTGTGGACAAGCTCTTCTrCrCTTCCCTTGGGAACATGAAACCCAAAAA 
■ GCCCGACTTGATAACTTAAATAGCAGCATTAGAGCTTTCTTGATAAAAATCA 

ATTCCCATAAGCATAGTGGATCTCCTGCATGAAGCAGACTCTCGAAACCCAG 
GGGACTCATTITCGTCTGCCTTCTCACAiCTTACGCTATGGTAAGAATGAATCC 
CTGCrAAAAAACCAAGACCrrACTGACATTGATTGACAAAAGACAGTATGAGA 
TGTtGATAAGTGGTCAATCTGAATAGCATCAAAGTGAAATAAAACAATATAf •
TAAACTCATATATAAGAGACAGGAGGTGTITGGACAAGAGAGTATCTGGGCA 
•TAATTTTCTGGTGATTTGGAGATGAGCTAGCAATAGCAAAACATACtGCAAT ' 
GTTAGTCAATTCAAGGGAGAAGGATAAATCTGATAAACCCACCTTTGATTCTT ' 
CCGGAAGCAGACTCTGAGGTTGAGGTTAGCATGGAAGAGTTTATCAGAGAAt . 
•CGCTATATGTAGAAGGGA^GAAAAGGAAACAGGATTGACGAGAGGGAGACG 
. CTGGACTGTAATGCAGTCTGAACAGAGGCCTCAAAGGGAGCTCTAGAGCTAG • 
GGTGGCTCTTCAGACTAGTCeCATGTTAGGGTGAGGGGACTGGGCCrTTAAA 
CCTCTGAATCCATTTGTCATTGGATGCAGACn'CCTGCGGGGGTrGGG^ 
GGAGGCATGACTTCAGGGATGGTACCTCTTTTCCACCTCGGGCAGCT.CATCAC" 
TQTCCACTACAGCACCAGTGGAAAAAATATGTAGCCATCTTAGCAAGAGAAA " 
TGTTTACTATTCAACTGATtATTAAGACATAGGATTCAATAACACTAACACTA • 
ATATCAATAACTAGTATTTAACAGGGGTTTATTATGTGTAAGTACCATGCTAT 
ATGATAATATACTGTTTCATTTAGmTAtAACTTTGTGACATAGGCATTGCTA ' 
TCCTACACTTCAGTGAAGAAACTGAAGATCAGAGAGGTTGAATTACTGGCCC 
AGGGTCACTTATGGTACAGCCAGGATTTATGCTCAGGACTGGCTCCAGTGCT 

gtgtgtaaacctttattctctactggAgcacacacagtatattacagtgctga 
atattgtogagaagatacctgtccaaggaatacagatttgcattcctactta 
atgtgtggtcttataaacaacatttaaacaagjttcatgagttactgtgtaaa 
tgttaacaatgtttagcagtitacaattgcattactttraaaaggaaaatgag 
taatagttaatgcrcgattgactattaaaatctltatttcatgacaagaagac 
ctgaagtagtataactaggtacctttataaagctaacaatgccgctggagct 
ccgaccacgagcatacactctitcatggggaaatcccatccattcaaatattt 
ccaaatacactgtcgattgattggtatcataagcagaatattaggctagaat 
aaaataagtagagttttcgataatcaaaagataatgtacatttattgagtct 
aatcatgaaggtctctitrgatcatgtaacaggtttccraat(ntrggg 
aaattacagttgtcttgctggtttgttttcacttctttaggtcrctctac^^ 
atcttgtgggactctgggcaatccrcacatgctcagtgfrrtgtgggctcgcc 
ctatattccaggtaccatgactgtgat.ccttggacagccaagaaagt6tctgc 
agcagaccaggttcagtaccatgtctttcrracaggtgtattaataatatrca 
aaaagctrattagttgagaggaaagagcattcatattcttgtagagaaacgg 
aaagtggacatgccatcatcatcttacatttctataaaaclrrgttaagaa'rt 
tattttagcagatatagcaagaatgaagagtactgcatctaaaatgaaaata 
. tggaaatAccaggaaaaaatcaagcagagtgttaaataggaatagctcaag 
gttgggagtaggaatgtgaacacagttttttgttgttgttgttgtttgm 
tgttttttaaactacacaaaccatrgtactatagaacattttgtgggtgtttg 
ttagagcattcatttataagcaatitatcttccaatitrttaaatgatggaga 
gatggatccaataagtgaatatttactgagtaccaactatgtggtgagattct 
gtgccaagagcttaacatgcttcatttaattctcacaaccctgcaatcctgat 
trtagagatgaggaaatgcttctcagagaggttatataacttacccaaggtc 
acattgctaataagtaagaataactatgagcagttatcaaatacccaccatg 
ggccaggcactaccataatgcttcatataaatttcaacttttaatcttcacag 
tcaccctctgatgtaggttctatgattatctgcctttcaaagatgagagaact 
caggcctagagaagttaagtgaataactagctagtaacctggagccaggatt 
taaaaccaagcaagctgccaccagagtcctaactttgaacctctgtgccattt 
attgtctitcaaaaaggggaactagaattcaaacacaggattgcccaaccct 
•AAAGCCTGAGTICITGCCAAAATTATmCTAAGACTCAOTGCAGAAC^^
TCCrAGGGGATTCATTAGAATGAAAATAGATTGAATTTCTGTTGATGCAAATG 
CATCTCATGCTCCCAGAAAAATACAACTTGGTGGGCCGAGAAAATAAAAACA- 
CCCTGGAGCTGTTTCTCAACCCTATCTTAAOTATGCCCrCCTT^ 
CACCCTCTCTGTTCTCTCAGAGAAGGTAAGAATCAGGTTTACTATQAAGATGG 
AGGCACACTTCTGTTACATCCCCTCTAATAAAGAATATTTTGTGACTGTTATC • 
AATCTGTATTGTCTGTAGTTTGTATCATGAAGACAATAACGACTITAAAA^ 
AGTTTGTMTATACTTTGCCTTTACCCTGGGCCAAAAAAAAAAAAAAAAAA^ 
ATCCCTACGGCTTCCrACCTTTGAGAeATCTTGTAGAATACATTCAGGGTGTC- 
TTGCTTGCATACGCTTAGAGAGTCCGTGAAGATTT'CTCCCCAACTGATATATr 
TCCAGGACTGTGTTAGtTAACAAGTTAATTCAATTAACACTTCACTGGGGATA 
CCATGCAAATAAGACAGTGGAAGATCGTGTCAAAACCTTATCCTCGCTCAGT 
CGCGGTGGCTCACGCCTGTAATCTTAACACTTGGGAAGGCTGAGGCAGGCAA 
ATCACTTGAGGTCAGGAGTTCGAGACCAGCCTGGTCAACATGGTAAAATCCG 
•TCTCTATTAAAAATACAAAAATTAGCTGGGCGTGGTGGTGGGCtCCCATAAT 
CCCAGCTACACAGGAGGCTGAGGCAGGAGAATTGCTTGAACCTGGGAGGTG 
GAGATTGCAGTGAGCCGAGATCGAGCeACTGCACTCCAACCTGGGTGAGAGA 
GGGAGACTGCCTTAAAACAAACAAACAAAAAATACAAAAAAACCCCTTATC 
CTCAAGAGAGCAAGGATATGTTTCTCTATGCTCTGCTGGCATTTGCCCAGAGG 
AAGGAGGCCAGTTTCTAGATATAGTtTtACTATTTTCCTCTCCATCAATCCTAT 
TTCATTGGTTCTACCTATACTTGAGGGGGCCACTCAGATACGAAATAATTACA 
CCACAGATTTTTGAGAAAGTTGAAGATAGGATAGAGGATTTTGGACCTTTTTA 
TAACACTATAGATGGAACCTACTmGCTATTTGTGGGACATTGTTTAAAGTT 
AAACCTTTATGATATCTTGGAAAAATTGGGTCTTGATCTCATATAGGGATAAA 
TGCTATTCCGTTTTTCTGAGAATAAAGATTGAGTAAGCTTTGGAAAAGTGGAG 
AACACAGTCCTAAAAGAACTGAAACAATCCrAATGTTGAAACATTTCtnTCA 
ACAGTGGAGGAAGTTCTTCCATATCCCATGAGCACACTATTGTTAAATGAAA 
■XTGAAGAGGCTATGGAAGCCGATAAAATAGGACATGTCATCTACTCTGTACT 
GTGGGAGAAGTAATCAATAAGGTTTTAGTGCAAATGAGAGGACACTCTTCGA 
GAAAATTGTCCACTTAGGACTCTTTTGATTCGGAAACTGATTTTGTAGAA^^ 
CTGCCATGCAACAGAGTCCTGAAGTCACACACTTGATTATCCTAATTTGATAT 
TTATTTTITAAAATATAAGTTATCAAAAAGCAAGTTAGTATCAGGA^^ 
TAACAGAAGCAAGTTTAAGGGATTTCCTGAAGTCATTCTCCACCCATCATTAT 
GTCTCGTACCTGATGCACCTAAAATTACATCTTCTGTCCTGGGCAGTGACTGA 
AGTTCACAAAATGGCCTTGAGTCATCAAGTAAAGTTAAGTGGATGCTGCTTA 
CTTMGCACAGAGGTGT GTCAA ATATTTCCnTAAAGACAACITATTATTCT^ 
•AGA ATAG CAACTACTTTGTmGAGCATTTATTATATTCCATATACTATACTAT 
GAGTTTTACATAAATTATCTAAATrAACTnTACAGCAACTCTATGAGGTTAG 
TATTATTACAACTATTTTATAGATGAGGAAACTGAGGCTCAGAACTTCAGTTA 
CAAAAGCCGTATCTGTGTGAATCCAAAGCCTCTTTACTTAACTACTGTGCTAT 
TCGTTrCTCTAAGTGTTAGTAGTTAAACAGTTTAATmAGGTATTTGAAAATT 
TCATTTGTGTGGAATAACTCCTTTCAGTTCCCGAAGGAGACAAGACAAATGA 
TAAACTAGGCATTCATTAATnTATTCAGTAGCGAAAGTACTTGGAAATAAAT 
TITGGAATTTTTCAGCTCATGCCTTATTTGGTACTGGACATTCTGCAAGATTAT 
CCAGGACTTCCTGGACTTTTTGTGGGCXGTGCTTACAGTGGGACATTAAGGTA 
TGAACTATGACTCTAATAACATATGATn:CCCn"GTTGGGATCTTTCri^ 
ttagatactttagcgggataatgtggfagttctgggcacagacacagatg^ .

caaagtaggcaatctgctgtatgtgtcctctctgtggtctgacttatctctgt 

ggactagcgtcctgttgtcatcggctccacttttaactaggtccctgctctct' 

a^tacaccaccatacatcctatgatgtgatacctatggtcacaataatgacg 

atcggttgatggtagcmcacrggtgatcaaaaatgggtgccacagtcttga 

ttgaaaacacacatggggctgaagcgtggtcaactggAaaAttagaatgaa 

atcttccatttacatgttgaataatatatactgcccaaagaatcttacatttt' 

.GTGATCTATCATTGCCCTTTCTCCrrGCGTTTGTTCCAGAGAATTGTTATTATC^ 

a^catgtacagtgtgtgttagtggggattcaggaaattaatattgttgatatt ■ 

tacagcacatggtagggaggacttatgacagtccittctatgcacaaagaaa 

aatacattttaaagttgttatgcataggaatacggagtaatctatgtagactc 

ttttagacacrgaggattaacagcaagtggaagcaacatgilu.catatccntt 

ctctmactgtcaagcctgtagatattgcctgaatatcatttttggatgata 

acagtttcaagaaagacagtgctgtgattattaaaaatagcatagtagagcc 

tggtgtagtggctcacgcrrgtaatcccagcactttgggaggtcaaggcggg 

tggatcatgaggtcaggagatcgagaccatcctggctaacacagtgaaaccc 

TGTCtoACTAAAAATACAAAAAATTAGCCAGACTTGTTGGCGGGCGCCTGT 
AGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCGTGAACCCGGGAG 
GCAGAGCTTGCAGTGAGTCGAGATCATGCCATTGCACTCCAGCCTGGGCGAC 
AGAGCGAGACTCCGTCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAG 
CATAGTAGAAATCAGGTTATATirrAGAAGTGACAATTTGTTTCCCCCrTTCT 

cattcccacttaaagtAttattaaaggtaaaatattataaatcaaaagtttta 

TTTTCCCTTACAGCACAGTGTCCTCCAGTATTAATGCCTTAGCAGCAGTAACT 

GIGGAAGATCTAATCAAACOTACTTCAGATCGCTCTCAGAAAGGTCTCrGTC 

TTGGATTTCCCAAGGAATGAGTAAGTTTCTGTTTTCATAATTCCATTTTAGCTC 

AGAGAGATGATTTTTtGAGACACAAAAATtTCmTrCCACTGAAGCTACAGA 

GGAAGGACCTCTGAATATGACTGGATATCCACATATGTATGCCTATGACAAT 

GCAGATTTITAAAAAATGTITGTATCAATGTTTAATGTTACCATATCTGTAAT 

caagatttgggagacactteaaacaattaattgtcagtgaaccacaaaggac 

aatttccaggggaccgatttagctgtccttttttcctgtgccttcatcctatcc 

taaatttgtgttaaaattgttagccacacacacaaaaaattcaactattttcc 

ttitcAactgtagtccacagttctaagaaatatccctagtttgtaaccaagag 

ccacacttitrctgtracctaagaaggcactgtcagtttcagttatgttgtcl^ 

cccataaacacttcccaaatgtttacagaggattagattaaattagattaata 

gattagatgaattgaagataaagaaacagagtgttatgaattttgactcgtc 

ttagttgtctgttctgtctgcttatgtacactgtcctggagatgaactataaa 

tttgtgcaagaaattctcaacttctgttctgttcaatcgtagagcctcattag 

GGGTTAAATACCAGCTTGAATAGAGTGGTTTAAGCATCTTCAGTTCCCAGATG 

tcrcaaatgtaatatccaactcaaagaaatcttgccaatgtactgcattcttc 
attttgacaccttgaaatgcatgactaacaaattccttttcgagaaagaatct 
ttaacctcaacacaataaeatca^tagactgtcaaagaaatagtaaattata 
ccccattagtagccataattctatgtaaaattgccacaactgtagcctggaat 
gagtccttaatttatatccttcagtattccctaaatttaaatagcaatgcatc 
attttattatgtacccaattttaatcctggaacataatttcagaatgatgcaa 
tgcttcctaaagggttttragactggttcattatgggattattgaagctgtgt 
ggtttagtagaatgaatagatgaatagatgcttttgaagccagacaaaccca 
GtCrrGCTTTGTTAATTTA(h"AGC7rGTTAACOT

TACAAGCCTCTATTTCTCTATATAGGAAATGGGCACACAAATACTTTGCAGAA 
. TTGGGTTnGAGATAATGCATATAAAACGGGAGGCACTGGCTGGTGCTCATG ' 
GAATGATGTTGATGAATGACAAATCAACTCTGCTAATATAGACGAGACTTTC- 
ATTCATnTAAAGGGTACTATGAAATCACAATCTTGAAGTATATGCAGTAAG 
GACATAGAGAAGAATGTGGCAGATGAmGAATGTGTTTCCAATAACTTTGA 
ATTCCCAAAATTATAATGGTGACTATTQATAAC.CTATCAAATGGAGAATCAA • 
ACCAAATATATTGAAAATATTATTTACATTAAAATTGAGAGGCGTAAAAATC . 
ATAGCAATGAGAACTCTTCAAAAAAATCAATTTAAAATTTTTCAtTGAGATGC 
ATTATTTACTCMTACGTTTAACAAAGGTATGAATGCtAAAATAAAATAGAA 
■ATAACATAAGGAATACAAAAATmATGATGGACAAATTrCCCArrCATTTTT 
TTTCTGTATATrGAGTTTTGCATTTGGGAAArrATATCAACTGAm 

acttttgctgaatggaagtattaaaacaaagggggtttttttgcXtgatattt 
cctataatitaaaaggtaatacatgtacctggagaaaatttagaaaatactg 
amgtcaatagaaaaaaatattccacaatctgaccaccagtggtggtttgaa ■ 

AACTTTTATAAAGAAGTTTGGGCTGAGTGTGGTGGCTCACACCTGTAATCCTA 
GCACTTTGGGAGGCCAAGGTGGGAGGATCACTTGAGCCCAGGAGTTTGAGGC 
CAGCCTGGGCAACATAGCGACACCCCGTCTCTATGAAAAAAAAATlTnTTA 
AAGAAGTCTGGGGAAAACAACrrAGCATTAGGGCAGATGTGCTACTTATCCA 
GAAGTTGCCTTTCTTTGCTAGTTTAATAGGAAGGGCTTGAGGATACTGATGGA 
GATTATGAGGGGGCTAAAAGTCGTCCAACACCCCATAGTGTCCATTGCCACT 
TCCCAAGGGAAATGAATGCTTAAAGTCAGAAGAGTCTAATTTCTGTTTATTAC 
TCCnrCTCTCACCTTGTACAiGAGCAGAGGTGAATAGTATTCTATTnTGGCAA 
GCTGAAAACAGAGACCTGAGCCmCTTTATATACAAATGTTTATGGATGATT 
AGATTAATAACACAATATAGTTCTTAGTTTTAAATACCTATAGTTTATTCCAG 
GAACTCnTACTTATATAACCTACTGTTGTAACTAATCCTGGQACACAATGTA 
AGGGCTTCGTCCTCTTGAAACACTGCTGATCCTAGAGGAAAATAGCCATTTCC 
TTTATTCACTGGCTCTGATGTGTGTGGCCATTCTTCACCACAGTCATATTATCC 
AemGAATCAAAGGTGTGGTGGATrATTCrAtTGAGAATTCTAATrCTCTGG 
GTGTGGATTTTACACTGGCTTTTATGTTGTCCATTTAGGTGTGGTGTATGGAG 
CCCTGTGTATTGGAATGGCTGCGCTGGCGTCACTTATGGGAGCTTTGTTGCAG 
GTGAGAGCTGGCCCCTGGAGGTTTAAGTCATAAATCACTAAATCTTTTTTCAA 
tGTTGATGTGACCATCertCCAGACTTCTCTCGATATATATCGACACCTGGAC 
ATATCAAGTGGCAGGGATGACTACACTTTTTAATTITrTTTAAITAAACm 
GTTTTGAGATAATTGTGGATTTACATGCAATTGTGAGATATAATACAGAGAG 
.ATCTCATATACTCTTTACTCAGTTTCCCTCAGTGGTAACATCT^ 
ACATCTTGATAGTACAATATCAAACTCATATATTGACATTGATATAGCCAAiGA 
TACAAAACATTTCTATCACTACAAGAATCCTTGCTGTTGCCCATTTGTAGCCA 
CMCCACTTCCCTTCTGCCCCTACrCCCTCCTTAATCCCTGGCAACAACT^^ 
CTGTTTTCCATTTCTATAATTTTACCAGGTCAAGAATGCTACATACATGGAAT 
•TACATAGAATGTAACCTTTTTCACTTGGCATAATTCCCTGGAGATTCATCCAG 
GTTGTTGCGTATGTCAATAATCTGTCCTGTTTTATTATCAGATAGTATTCTCTG 
GTAGGGATGTATCACAGTTTGTTTACCTACTCAGCTGATGAAGGACATCTAAA 
TrGTTTCCAGTTTTTGAGTATTACAAACAAATCTGTTACAAACATTACATAAA 
GGTTTTTGTGTGAGCATAAGTCTTCATTTCCCTGGGATAACTACCCAGGAGTG 
CAACTGtCAGGTGACTGCTAAATGTCTACTTTTAAAAGAAACTGCCAAACTAT 
TTTCCAGAGCATGTCATTTTTATATCACTAGCATAGACAAATGGCCCAGTTtA '
AACCTCATTCTTTCGAGCATTTAGTGGTGTCITITri^^ 
GATAGGCATATAGTGATATCTCATTGTAGTGT^AATTTGGATTTCCCTAATGG ' 

• CTAATGATGTTGAAAATGTTTTTCAGCGACTTATTTTTCATCTATGTATG'^ 
TTCATACATTATCTCATAATGTCTirrGCTCATGTTCTAATTCAATTGm^ • 
TTTTTTACTGGTGAGTTTTGAGTGTTCTTTATGTATrCTGTATA^ 
GTCAGATGTGGTTTACAAATATITrcrTACACTGTAGTTTGTCTTTTTATCCT 

■ ATAACAGGGTCTGTCAAAGTGCATTTTTTTTTTTTTAGTTTGQATAA^ 

gtttatcaatttgtcctttcatGgattgtgtttctggtgtaaagtctaa 

TTACCTAGCCCCAGCTm<3AAGATTTTCTTCTATGTTrCim - 

TAGAGTTTTACAtlTTATATTTAAGTCTACAATCCCmGGAGTTAAT^ ' ." 

TAAAATGTGAGACTTAGGTrGACATTCTCTTTTCCTCTATGGATGTGCAACCA 

GCACCATTTGTTGAAAAGGCTTrCTTCCATTGACCTGCCTTTACACCTTCGTA 

AAACGTCCATTAGGCATATTTGTGtGAGTCTATTTCTGAATTCTCTGTTTTCTT 

TCATTTATTTATGTGTCTGTACrTCTGCCAATACCACACAGCTTTATAATTTGA ' 

TTATTTTGATTACTGCA GCTn AAAATAAGTTTCAAGATCAGGTCGATCGATT 

CCTCCCACTGTATTCITATTTTTCGAAATTGTmAGCTATTCTAGTTCT^ 

CTTTCCATATGAAGTCTAGGATAATCTTGTCTGTATCTACAAAAAAAATCTTG 

CITAAATATTGATAGCCTGAAAGGiTm 

TTTACCATGTTGAATTTTCTAAAACATGAACATGGTATGTCTCTTCATTrATTT 

AGCTTTTCrATGCAAATCGTATTTTTTATGTTGATGTCTGTGTGTTCAATGCT 

AAATGTAGAAATAAAATTGATGTGTTTATATTTATCTTCAATCTTGTGACCTT 

■ GCTGAGCTCACTTATTAGTTCTGATAATTTTTTGCTTCmGT^ 
GTTTGTTTTGTTATATTCCTTGAGATATTCTACATAAACAGTCATGTCATCTTC 

• GAATGGGGCAGTTTTATTTCTTTCCTTCTGATCTGTATGAATGCCTTTTATTTC ■ 
CnrATTGCACTGGCTTCAACTTCCATATCATGTTGAATAGAAGTAGTGAGAGT" 
GGAAATCCTTACCCAGTTCCCCAATGTGAACAGGAAACTCTCTATTCCTATTC 

.. CAAATGCTTTTTCTGTAATCAATTTATATGATCATGTGATCTrCTTCTTTAGCC 

tgcttacaggatggattacattgattggttttttaatgcagaaccaqccttgc 
ataccgggaAtaaaccttgtttggtcatggtgtgtagttatttttatatat^ 
ctgaattatatgtgctAAtattttattaagaatttttacatctatgtxcatgaa 
ggatattgatctgtagtggtgtgtgtgtgtgcatgcatgcgtgtgtgtGtgtg 

TGTGTGTGTGTGTGTGTGTATACTGeCtTTGGTTTTGATAGCAGGGTTATACT 
AGCTTCATAAAATAAATTGGGAAGGATTCTGTrTrCTATTTTCTAGGAGAGAT 

■ '..TGTCTAAAATTAGTGCTAATTCTTCCTTAATTATTTGGTAGAAtTCTCTAGGG 

AAAGCATCTGGGCCTGGATTTTTTTTTAGTTTCAAAATTAT^ 
TTAATAGTTACAGAGCTATTCAAATTATCCATTTCATATtCGATGAATTGTGA 

• CAATITATGTTrrTGAGGAATTGTTCCATTTTATCTAAGTTTGCAAAm 
ATGTATAGTTGTTCATAGTAGTTCTGTACTATCCTTTGGCATCTGCAGGACCT 

. GTAGTGATAGCCCCrGTTCCATTCCrAATATTGGTAATTTGTATCTTATTTTTT. 
TCAGTCITGCrACAGGTTTGTCAATTTCATTGATGTTTTCAAAGAACCAGCTT 
•CTITITO.CATTGATrTTTCTGTGTTGT^^ 
ACTTTTATCTTTATGATTTCTrTTCTCCTTGCTTTGAGT^^ 
CGAGGTGGGATCTTAGATTACTGATTTGAATCTrCTCCTCTTTTCTAATGTGTG 
CATTTATTGCTGAAAATTTCCTTCTTGGCACCAGTTTAGCTGTGTTCCACAACT 
mGACATCTrGTATGTTTGTTTACAGTCAGCrGAATGTATTm

TTGAGATATt:CTflTGATITATGGATTATTTAGAAGTATGGTACTrAGm 

AGTATTCAAAGATTTTCCTGTTATCTTTCrGtTATTGATTACtAGTTTG^ ' 

ATTGTG ATCA GAGAAGACACrCTTTATGATTTCAGTACTITr^^ ' 

GGTCTGTTTTATAACCTAGGATATGGTATTTTGGTAtATGTTCCATGAGCACT 

TGAAAAGAATGTGTATTCTTTTTITATTAGGTAGAGTGTTCCATAAATATC^ 

TrACATTCTGTTTGTTGATGGTTGCTGATTTT(n"GTTTGGTTGTTT^ 

TTGAGAGAGGGGTGGTGAATTCnrCAACTATAATTGTGTATTTGTCTATTTCC 

CTTTTAGTTCTATCAGTTTITGCTTTGCTTC^^ 

TTAAGCTTTTTGTTATTATATAATGTGCCTATCTGTCTCTGAGAAATT^^ 
CTGTGAGGTGTATTTATmATATTAAtATAGTTACTTCTACTTTCCATTGATT ' 
MTGGTTGCATAGTATATTTITCCATTTTTTCACm 

ATCCATCCTGTCAATCTCTGtCTTTTCATTGGTGTACTTAGACCATTCACATTT 

AATATAATTATTGGTATGTTATGTTAGGGClnrAAGAATCAGTTTrtAGTT 

TGTTTGGTATTrCTGTTTTTTGTlTrCTi3C(^ 

CTrCTTTCTACCATTrCCTrTCTGTTTAGAAAACTTCTTTTAGCC^ 
. TGTTCCTTCACCTGAGAATGTCTTGAGTTrcCCeTTTATCCCCAAAGGATATTT 
.TTGTTGGGTATATGATTCrGAGTTGCTAGTATTTra 
AATATTirrCTACTTTCTTTTTGTCTCTATCATITCTGATGATAAATCT 
ATTCT AATT GTTTrCCCTTACAGGTAAGGAGTCATTTCTTTCTGAATGCTCTCA 
GGTTTTTTTGTTTTGTTTTGTTTTGTT^^ 

TGATGTATCTTGGTGTGTATTTCCnGtTTTGAGTTTTTGATGTGTCTTGGeAC 
GTGTTTCTTTCTTTTCCTATGTTTTTAGGGTTACTCAGCTTC^ 
CAGCTTCmCATAAGTCTGAAAGAGATTTATGTCTCTTATCAAATTTGAGAA 
GTrCAGACATTAATTCTTTGAGTACTTTtCCACCCCAT(m'CTTTC^ 

ctgggactctcatgacatgaatgttaaatatttttgttatagttctacagatC 

CCTGtAAGTITATTTTATATTTCTTCATTTCATTTCTCTCTATTGT^ 
TTATTTCTGTTGTTCtGTCATTCAGTTCACCGATCTTT^ • 
. ACTGCTATTGTGCCCATTCATrTAAAATTTTTT^ 
CAQAGTAACAGTTTCATtTGGTTCireCTAATATCITCTTTTm 
TinnrCrCTCTATATATATTTmCCCTCTATITmCAT^ 
CTAATTGTTAAAGCATTTTTTAAATCATGGCTGTTTTAAATT^^ 
ATrCTGTTTCTGTAATCTTTATATTTAGTTTGAGATCTTCCTGGTTCT^ 
AAAAGTTAATTTGTGTTGAAACCTGGGCATTTCAATATTATGTGATGA^ 
TGAATCTTACCTTCTGTTTTAGCCAGCrTTCTCtGACACTACGTCAGTAGCATA 
AGGGAGGGTGCCGCCTCATTACTTCAGGTGGAGGTAGAAGTCCAGAGGGAG 
GAGCTCCTTGTTATGCTGGATAGGGGTGTGTAGAATACTATGAGACATATTGT 
GGCTATAAGATTAATGATAGTGCATGAGGGCCACTGAAATATCTTGCAGGGC 
TGATACTACATGTMGTAGQAATTrACAACCCTGGCTCAGGGATTTCCAGGA 
AAAAAAAGCCACCTCAGCACAGAAGCAGCTTTCATAAACCTTAGAACAAAGC 
TTACTTITACAATAATAGCTTAAATACCCTTTATGAAAGAAACAGCTGGTAAC 
TAACCTGGAGTAAATACAGGTATAAGAAAGGGAGAAGGACCCCCAAAGTCT 
GACAAtGGTCTCTGGATGAAGACTCTCTGGTCAGTTCATGATCTGACCCCCTG 
ACTGTATCTGGCCCATGACACCAGCTTATTCTCACTATCCATCTTCTAAGAGT 
GCTGCCAGAATAAACCGATTGAGCATTAGATGGTGCCTAAGACTCATCTTTG. 
ATGTGAAGTGAACAGAAAAGGAGACATCGCCCCTGGGGAAGCTGGT!rAACT • • -
AGGTCCACCTACAACCTCTGAACACAACTGGCATTGAGGATAGGATAAGTAA 
GTGAGCAAGTAAGTAATGGCACCCATTTCTAAGGAAGGACTGGGAAAGGGT ' 
GACTGGGGACTGGTTCTGATCAAGAAGTCAAATGAAGCCCTCCGGAACACTC' 
CGTAGTGT.(3TCTGGTAGTTTTTATTTTACTTTGTGGCCT^ ' 
ATACTTTGTCTGTG<:CTCTGTTGTGAAATTGCCACCATGAAAAACCTGeCAAA ' 
ATTATTCTTAAGACCACTCTGGTCCCCAGTAACTGGGGACAGGCTCTAAGGA 
GGAGGAAGCAGAGGTGGAGGATGCCCCGGCTTGAAGAGCTAACCACTGGCT • 
GGTGTGTCTCTGGCTCCtGCTQTGCkjAGGGTTGCTCCTTGCCCCCAAGGATCr 
TGGCACTGTGATGCCACAGACCAAGAAGGCAAAGGTGAAGCTTACCTGCTGA • 
AGGfGTTATGTGCrGTTGCCCACCAGCrAGGGACTTAGAAGCATATTTCCCAAT 

agatccttggactaacaccaccagtgaagtccacaagaatcatcaaagggca . 
tgtcggagctgttcccagcccaccatgcactccagcrgagacagtgtggcrtc- 
. tcaaggggtgtgccccagcctgtgatgtctgctgagtgttgctggcagcagc • ' 
ccataggcaacatgcccaagtcacgcgggctcacctcagagtcgtgteattc . ■ 
tgcctggtcgatccagagccctgtgagctgtggaa*rgatgatcctgggtgtg 
ctggcttaccgctggccacaccacaggtgtcaccagtcatcacttgggctga 
gaaagctgctgccacctccacagctggctgagatgaggaagtgtgtcctaga • 
gaccatatiggtccatgaagaggttcacccctaatgtggacagaaattcaaag 
cctgctcaaagacctggtgcaggaacccaaagagaaaggcaccacctggctg 
tgctgtgtgtacctgagaggaagtgctttgcaactcataggccgagaaatac ' 
aaatacagcagccagcggctatctttaaagataataagatcatttccttattt 
caagaccctgttgagcttatccaggcacagagtagagtgggtagagataact ' 
gtccctgcagtccgctgcagtggctattcaggcctggcgaatagcaccccca 
gaatgagcctcttttcatcaagggctgaaggtggctggtgcacactgaggaa 
• gaaggggagccactgtggcgcttaggcatgctccagtgagtttgcacagccg • 
ccactcccatgggacaggggcacaactgggagaccttggaacagaccacctt 
gcacccggatggtttacagatgtttttaaggagagcccatcgtaagccaaac 
caactctcctggccctgatggccacagtgcctgaagttaattgagctctgtag 
gtcctcatcaccctggaaaatgctgaagaatagctaagaatccctaagaaaa 
cttagggaactcaccctataecactcttcaggtccagggtccctagcgggctg- 
ccttctttgcaccotcagacagaaccatccmcagtgttcacaagagcacct 
cgctccctacccaccacacaggaccttgcagccagaggtctcacactgggct 
ctgctctccctggccttctcacacccAtcacccacgggatgaggcccagtcca 
■ gctgttcacaagggcagagtgacctctactggcagatacgctgaagggtgct 
tccaaattggctcctgtctggcaagcctcatgaattcctggaaactctcc'ita 
ataccaatgtcgtccagtcatcccacccactgacccaccacccatctiggatgg 
ctgagcatcaacitccacctttgggaagatgtcttcactgctaacattcctgc 

CCTGTGTCCAGGATGACCCACCTGGACTGTCCCACCCACC.CTACCTATATCCC 
TCCCTGGCAAGACCmACTGGTATATTTGCCTAGGGGTCTCTTGGGCCACCA 
CTGTTCCTGCAGCCAGTGCCTCCTAGAGACACTTGGTGCCCCAGCTGTGCCTG. 
ACAGCCATCTGGAATGGACAACACCTTGAACTTTGGCTTCATCGGGCCGTGTT. 

caagAtcaggattttggatctttgacagcacctcagcacacacagtgtcccca 
gacctaattgcttagttcfgtgtccrccttatccitctgttccattttcacctg- 
tcattgctctgtaatcaagcaactctaagcctgctaccactgagtccattcaa 
ctcctttcacctgctcatgtaatcgtgtttccaagccccagtgtgcccgtgttc 
TTTTGCTGCTACTCCGGGTGGCCTGACTTTCCACATCCAAi^^

gacctattcaccaattacttggcactgaggtcactgacagcctccaccacatt' 

TTGGCCACCCACT(3AGCAGCTAACCTTCCAGCCrGCTGGGCATGCCTrCATTG ' 

GCCTGCCACACATCCATTGCTTCCCGTCCAGACTTGGTCGCTACCTGTCCTCC 
ACrCTTCCCCCATCATTGGrnA<rr(rrrirrnnA a knr^r^An>Anr>r^r, . . . : 




• TGTGATGATACAGTTCATTTCCAGGTTTTTCCAAGGGCGCCACCAACTCCTAC 
.TCACeGTCCTACAGAAGATCCATTTCTCAAGArrTGACTGACCATGGAGCGCA ■ 
CmCAAA(nTGGGCAGCCTCTCAATTATTGACCCCCTCA:GGGTCAGGGGTGG ' 
AGTGTGGTATACAATGAGATATATTATGi3CTATAAGATTAATGATAGGCATA" 
AGGCCACTCAAGCAACTCAGAGGGTCAGTCCTATCTGTCAAACTTAGCAGTA ' 
TGAAGCAACCTGGCCGGGGATTTCCAACCAGATTTCCAGGAGACCAGGTCAe > 
CTCAGCACGATGCAATTTTCACAAACCTTGGAACAAAGCTrACCCTTACAAG 

.CATAGCTTAATCTCTCTTTGTGAACAAAACACCTGGTAACTGACCTGGATtGA " 
. ATACAAGTATAAGAAAGGAGGAAGGATCCCCTAAACTCTGAGAAAGGTCTCT ' 
AGATGAAAACCCTCCTGGTCTGTCAGTCATCTAACCTGTGACTAAATCTGi^ 
CACGACACCATCCTGCTCCTGCTATTCTTCTGGTAAGAGCACTGCCAGAATAA 
AATGCATGAGCATCAGACGGTGTACAAGACTCAACAATGATGCAAAGCGAA 
CTCAAAGGAAGAGGCTTCCCTGGGAAGCTGGTTAACTAGGACCACCCGAAAC 
ACGCGAGCACCACAGGGTGGAATTCACTGACATCATGGAGGAGGTGGCTTTG 

• TCACTGCTGGGCCATGGTGAAAGTCCCGATTCTCCACTAGGCCTCCCATGTCA 
TTGCCAGCAGGGAGGCAAAGGGTGCCTGGTGACAGTCTGGGGGGATGGAAT 

. CTAGGCTGGCATGGGTGTGGGTGGGGTGAACAGATTATTCTGTGGTGTTCGG 
ATGGAATGGAGCGGTTGGGGTCTAAGAGTTTTCTATCrTCCTAGGCTGCrCCT 
TTCCTGGC CCTGT GTCTAGAGAGAGAACAGGATTTTGTGAGGGCrmTATCT 
TTTTAAAGTTTITOGCCTGTGTCTGTTGCCATTTCCAGGCT • 
CAGTTCTCAGGCTGGGATACATGAAGCAAAAAGAAAACTTAGGGAACTCACC 
CTGTACCATTCTTCAGGTCCAGGGTCCCTAGCAGGCTGCCTTCTTTGGACCTTT 
CAGAGTCTTCTTATGTTTATTTTATATGGAATGTCAGGATTm 
GTAGGAGGAATAGAGAAACATATGTTTATCrTCrCAGAAGTAGAAATACTTC ' 
ATTAAAAAAAATTTGTTtATTTACCrGAAGGGTCTTTCACCTCCTGTGGCGTT 
TGCTCTTCTGATTAGAGAGCATTTTTGTGTTGGAAAGCTCCCTCCTGTTCCTGT 
GGAGCTGTGGGCCTGGAGAGGACAGACCAATCTTTACAACAGTGTCTTCTTC 

• CTGAACCCTGTGGGCACCTCACTGTCTTC AACACGCAGGCCTTGGCATAGACT 
TCTCCCTCTGCCTGCAACACCTTCCTCTCTACCTCGTCCACTGCCTAAGCCTTA 
GTTATTCTTCAGTTTTCATTTrAACACATTACTTATTTAAAGAGAC^^ 
TTACCTGGATACATTTGCATTCCCTTCTGCTAAATGTITCTAGAAGCACOT 
AaTrCTTTATCATCrrCCATTATCAAACTGTATrACAGTTAmATTTAGto 

tgttttactgaatAgcctgtaattcatgaaggtggggaagatgacttcgttgc 
. tcaatcctgtatcaaatatctggttcctactgcagtgcccaggacaaagcaa 

• GACATT/lACAACTTTAGCTTGAATATTAACTAATAATGTAGTTAGGCTCATAA 
ATATCCTATTATCTGATGACCCCAAGTGCTCCCTGAAATTTCCTTGGCCACTT 
GGGCTGCrCCTCTGGGATCCTGGACCTCCCCTTGACCTAGAAACTTAAAGGCT 
AGTACrGGCACAGCAGGGGTCCTCCAGGATCTTTCTtCAGTCCTAAGGCTTGG 
.GGCCATTCTGGTTGGTTGTTGTTTAGGACATAGCAGTTGTAAAATAGTTTGAC ' 
CCCATCCCTGATTATATGAATGAATGAATTAGATCCAGAAGCTCCAAGCCTG 
CCCAGCACTCCTAAACCACAGCCAGTCATACAGAAmGCTTTCTTCATAGTC
TGCATTAGTGCCTGGAGCn'OTATTGTGGTTGATTCCCCrrTm 
.GCTATGGCCTGATTATATACAATGACATGTTCTAGCAAAGTCATCATAACTCC 
AAGCCAGTTCATCCTTGCTTCATCTCTAGCCAAGGTCTTATTTCCTCTTGTAAT 
CTTTTTTGGGTGTtnTTGTTATGTCTCCAAAACrGCTCACAGATCACAC^^ 

atcActtgctttgtgggtcatatgacgtcttgcagtgaagtctatctcccttct 
•gactggactgtgtgtaaactgactcaggatcacatccttgagtccgcatcctt" 
ttaaacactagttctcctctcractcagtaccacatccttttcttctt 
agtaggcctatttaaaacaacaacgac^^.caagaagcactatrtcgtctaat 

■ attrgcctgacttggtattttctctttgattcattcattctgct 
cctttactctctggacaaaccagtgattccctgctcttcttttatggttgtgtc 
catt<:ntaagaa.crrgggcatragattattgggaacgtatatctcttggaagc^ ' 
tcatttagttgctttgcctcatgcgatgtttgcactgtaaatctrt^ 
catggatttgatatctctacatcaaagcaaaggttgcagttatctaattagat 
titaaatctatttctacatgatttgatcagtctttttaaaatcgtatttct • 
tctagtttcatccttagggaatcaaatcaggaatggcaaatgggtttcatcct 
cctctctggtcacatagcagtactcgtccaccgttatgagaggattttgagat 
tatcaaagcacaaaggagtgctgtgatttataaacacctgcaagagcaatgc 
taagaggagaggtatcattcatgatttatacattcacataggcactaccattg 

TCGCAGACCTCTCTATCTTCTTTGTGTTGTGTGGCCTTTTTTGAGGCrA(^ 
• AGAAACAGATGGATeCTTGAGACTGAGATGCAGAAACTTGTAAGTTCTAATG 
Acm'CirrrCCAGTGATAAGGCTATCATGACTGAAACTATGATTTTCAGAAGG 
AGGCCGAATACTTTAAGTCATTATCCTGATGAAATGACnTrGAAATATTTGAG 
TTTCGATTTGAGATTGCTAATTGCTGACGTTGTATTATTTITGTAGGCAGCACT 
CAGCGTATTTGGTATGGTTGGTGGACCACTTATGGGCCTGTTCGCTTTGGGCA 
TiTTGGTrCCCTTTGCCAACTCAATTGTAAGTACAAAGAATGAAT^^ 
GGATTACTTTTTGAACTATACTAGCAGCTCTACACnTITrTCTCAGTTGGT^ 
TTTGAGATTTGTCATTAGCTACTTGTCTTGATGACTTAAATTATTtCTGTTGAG 
TTTGGTGGAGTGTACAAGGAAGTACTATGTATGGGGACCCTAAATGTGTGAA 
GCTCAAIGGAAACTTCCAGAACCATATAGGGCAATTTTAAATATTCATAATAT 
AACTAAAGGAGCAACATTTTTATGCACGTACCTACTTGCCCTTCTCAAAAATT 
AATAGCTAGGATTTAAAAGAGATCAATAGGCTCAGAGATGAGAGATTTAAGG 
: GCAGAAAACTTAGGATTTCTGTGAATAGCCATAGCACAGCAAAGAGAAGGA 
ACACAATCTTACCACCTTGGCCAGGAATTTTCTACTCCTGACATTTAAAGCTG 
.TAGCTCCCACAACATGATTTAGCCCCAAAGGGGTGATCATAATTGGGAATAT 
trTGCAGAAAGACGTA^TATCTCTGTCCTTCTACCTAGACATATCTGTGCTTA 
GAAGTGCTAACTTTTGTTTGGAATAAATGAGCTAGAAATTATTTCTCTGAAAC" 

ccagaagaaagtccactggttcagtctggctataagatgtagttcaggaaaa 
. actgataatgtatgtgcagtcgcttcagAtatagaatagccatagaactctg ■ 
actrcacatttggaattctattttcctrataaggcattaggaattggtaagc^ 

■ AGATArrAATAAGGACTATTGTGGtrATTATTTATTATCCTCAATGAAATGTC 
ATATGAAAAGCTGCTTTTGTAGGAATTAATCACAATGGAAGAGATTTGTTTGT 
GTCCACCTGAGAATGTTGAAGCTTGGATTAGATTCTTTGCATGATGAAATGCA 
TTAGTTTGTTAAGTATTAGAAAAATGTATTTAAAAATCAACTTTTGATAm 
GCATTTGTTGCCAGCGTGTATTTCCTCCGGAAGAGCTGTAGCTGACTAAGCTA 
■ ACATGTCCTTTTCTGGGGGCGGTAACGGAAGAATAACATACTATTCTTCCTGC 
CATAAACAGTATCnrmriTtmA^

miCTGCCrrCCnTAAAAGTGGCCTTTCrCACTCCTTmCta 

AATACCTTTCAlTCATTTATrCACTTGTrCAmATTCAATAGATrTTTATG^ 

.CTGAGeACTATCTTGGGAGGAGAGAGCGTTGAATGTTTCTGCCCTTGTGAGTT 

TTATAATTrCGTGCCAfGTTGAACTrGGCTTTCTAATATTAAT^ 

•AGCAAATTGtATTATTTTTCTTTTAAAGGCATTCrrGT^ 

GTrGGTCTGATGGCTGGATTTGCCATTrCTCTATGGGTTGGAAITGGAGCTCA 

AATATATCCTCCACITCCTGAGAGAACATTCklCATTGCACOTGATATCCM^ 

gctgtaacagcacctacaatgagacAaatttgattacaaccacagaaatgcc •• 

ATITACTACTAGTGTTTTTCAAATATACAATGTTCAAAGGTATTGAAT^ 
.TTTATTACATTATACmAAAAAATTTACGCAACAAGTAGAGAACCGCACTTG ' 
QTirrGTCTTGCTTACAACACrGTGATTTTGGCT^ 

ACCCATGTGGTTAGCTATAGTATTTTCTGCAGCGGTAACAAAAAATGTCATAT 
•mCATAATTTCrCTAGAAATTTCTGCCTTGTCIACAGATCTAAGA^^ 
MTATTAATGAGAAACTTCTGTTATGTGTAAACrCTCCTAAACACCAGCrCTT 
AGCTGCATGAAGAATTATGtlTGTCTTGGAAAAACTTTTAAAATGGAAAGCA 

caattatagaataaatattgotatatagtcttagaaagatagattttactga 

ccaaaagctacaattatttaaacatgitaaataactgccatttgttcagttga" 

agattccaaatctttaaagcattagaagtgattgcagctgtgatcttcatgct 

ACAGATTTCAGTCATGCAGTACACTTTGGAGCCTCTAAATGCTGAAGTTGTCT 
GATTTAACGTACTGAAATAGTGGGTAGAGGCATGGTTTATTGTACAGTGAAT 
GTGAGGTCAAGACnrrcnTrAGTGGATATATAAGTGTCCAGCTTTAAAGCACA 
AACCCTGTGAATACGTTCAAGGAATGCAAAGATGATGCCATTGCCCCTAGAG 
TATTGCCCCAGTGCCAGCTTATCTGGAATGACATCAATATAGtCATACCTTTG 
GGGTCAAGACACGGCATACCCCTCTTAAAATGGACACACTCCTGAGAGAAGG 
AAeGTGATCATACATATAGCCATTATTAGCAATTTGTGTTCTGGAGAGAATTC- 
AGCTATGAGAAAGCTAGCCAGTGGGACTTCCAGGGGCGTTCCAGGCAGGTAA 
GCTGTGTATGGTAGAGTGGAAAGGTAGTTCCAGAGCCrGAGGCGTATTTGAG 
GGAAGTrTTACTAGGTGAGGTGCTGTGGAGAGAGGGGCAAGGACATGTGGA 
GAGCAGTCTGGACAGGCATGAAAGCAdTAATGGGAGGGGACTGGGAGAGTC 
ATCAGAGTCAGAAAGGGAGAAAAGATTAATTGAAGGTTAAGTGAGAAGACA 
CAGGATGTTCTTCITGTTTTATGATTTTGCTTAGCGTm 
TATlTGCCCTTAAAAGTCrAAGGCAAAATGGTTATTTGGATTCGTGAAACAAA 
TAAATTAGCTGAAGTTTATTCTATGCCTGGTACCATAATAGTTTTATATGTTA 
GmACTTATCAAATAmCAGCTAATAAACAAGAATTiTTGAGCATGGACTA 
. TGTACCCAGCACTGTGCTGGCCAtTTATrrAGTATCC ATGGAAGAAAATCrrC 
CTCrCCCAGCCACTGTTTCCTGCCTTCCAGAAAATTCTAATCTAGCAGGAGGG 
AATGTGCAAATATACGCAAAACAATAAACTCAGTAAGCATGCTAAGTATAAA 
MCTAAGACATAGTAAATAAAAAATACTTGATACATTTTTGTTAAAAGAGTG 
AGTACAAAGTT.GGAGTGTAAGTTAGCATGAAGACTTCAGAGATATACTAGTA 
CrTCAAGAATGCATAGGTTTGGGTAGATGAAGGGGAAGATCTGGAAAGTCAT 
TGTTGTGACAGAAAATGACCAGAAGGGAAATTGGACAAGAAAATAAATTAA 

aacatgcattgtggtaagtaatattcacaggatatcatgqgagccagaagga 
ggaagtctgcctgactgaagtcaaggaaaccttcttagaggagaattgcatg 
ggctggagttagtcaggtagcaggattgagcacagggacacaatctgtatcg 
caaggaaggaggagtgtgtgtcgcagaatgaagaaacgaccggtgcaaaag 
CATGMGGACTGAAGAATCAGGGCATOTAGGAAATCATCCTAGTTCCATAG

. GGCTGGAGCACAAGGGTTGTGAGCCACCCAGCTGAAGGTTGAATGTGTGATG ' 
CTTTGGTTGCCTAGATACAAAITGTTTCAATTGCCTAGATrCAGTTTTGAT^ 

. mAAGTITACAATATTTGATGGTTAGATTGGTAGCtTGCTGTGGTmAAm •• 
GTAraCCCTTAAGATTAATGAAGTCAAATACATTTTCATTTGmGTTGGCCA 
•mGGATATTTACTTTrGCGACTTAGtTATtGAAGTCm 
GGATTCTCTGTCTTTTTCnTAeTTACATAGGAGTrCATGACATATTCTACATAT 

. GAGATATGTTTTGAAAATGTITCTCTCCCATTCTATGGGtTACCTTTTTACTCT 
.TTTAATGATACTTTTTTrGTCAGTCAAAATTTCTTAAT^ 
TCrCAGTITTTTCCTATATGGTTAGTeCTTTAfGTGTCiT^ 
TGTCTACCCTGCAATCATAAAGATATTCCACTAATTTTTTATTTAAAAGCn^ 
TTGTTAACCATTTCCATTGAGGAAGTGTAATCCACTTTGGAATTGATTTCTG 

.•GGGTGGTGTGATCAAGATTATTTTTCCCCATGTGGATACTCAGTTAACTAAAA 
ACCATTrATTGAAAAGACTCCCCTTTTGCCCCATTGAATGCACTGGCGATGAA 
•AACGATGTTAACCAAGTAACTATTTTTCTGTGGGTCTATTTCTCAATTCTCTAe 
TCTGTTCTATTTTGTCTGTCTITGGCTCAATATCAGACTCTCTTAAm ' 
GCTTAACTCAATGTGTTCCCCTCCTTTCTGGGTTCTTGTCACCTCAAGTTCTCA 
TtGTTTTGTTTTTGTTTTTGTTTGAGACGGAGTCTCAGTCTGTTGCCCAGGCTC 
GAGTTCAATGCAACCTCCACCTCTGGGTTCAAGTGATTTTTGTGCCTCAGCCT. 
CCCAAGTAGCTGGGATTACAGGCATGTGCCACCACACCTGGACAATTTTTGT 
ATCTTTAGTAGAGATGGGGTTTCACCAGGTTGGCCAGACTGGTCTTGAACTCC 
TGACCTCAGGTTATCCACCTGCCTTGGTCTCCCAAAGTGCTGGGATTATAGGT 

..GTGTGCCAGCACCCCAGCCAAGTTCTCATTGTCTTGATCTCTCTGATAGCTTC 
AAAAACCTGCCTCCCTCCCCATGCCCCAGGTTTTATATATTGTTCTCAGCAGA 
ATGATTGTTCTAAATCAAGCAGCTATCAAAACTGAAAGTGGAATCCTAGAGC 

■ AATCATTGTAGAAATGATGATAGAATGTrCATGTTACTCTTTCACTTAAAACA 
TTGCAATGAGTGCCATTGCCTGTAGAGTAAAATTCCAACTCATGCCTGGTCCC 
ACAAAAAGCCCCCTGCCTATCTTTCATCTTGTAATCACTGTCTTGCTCACTGT • 
GCCTCAGTCAGCTGGTGTTTAGTCTCTGTTTTCCTCAAGAACATTITGTITCT^ 
AGATCTCAGTTCAAATGTCAtTTATTCCCAAAGGTTTTCTCTGACCACTTG 
GTGTAGTTCCTCCCTCTTCACCCTGTTAATAGCACATCATCTTGTITTACTTTC 
CTTA(^GGACTITCTACTAGTGGAAATTAITATATTTATTAATTT<}TTrATTGT • 
ATTTATCTCTCGATTAGAGCATAAACTATATGTTAGCAGGAATCTTATCTACT 
TTATTTTTAAAAAATAAirrC AACTTTTATTTTAGATACCTGT^ • 
ACTTGGGTAAACTCTTGCCTGTCTCCCTTCACCCTCTGGTAGTCCCCAGTGTCT 
ATTGTTCCCTTCTTTGTGTCCATAAGTAACCAACATTTAGCTCCTACTTATAAG 
TGAGAACATGTGGTATITGGTTTTCTGTGCCTGCATTAACTTGCTTGGATAAT 
GGCCTCCAGCTGCATCCATGTTGCrGAAAAGGACATGATTTTGTTCTrTTTTC 
ATGGCTGCATGGTATTCCATGGTGTATATGTACCACATTTTCTTTATCCAGTCC 
ACCATTGATGGGCACCrAGGTTGATrGCATGTCTTTGCTGTTGGGAATAGTGC 

■ TGCGATGAACATATAAGTGCACGTGTCTTTTTGGTAGAACAATTTATTTTCCT • 
TTGGATGTATACCCAGTAGTGGGATTGCTGGGTCAAATGGTTTATCTACCTTA 
TTTATGGCTGACTGTCCAATCACAAGAGCAGTGCCTGGGACATCATGCAGAA 
TTGAATGAATAAACATGACCTTCAAGTATTTTAGTATATAGTCAAGGTGAAGT 
TTTGCmCAATTCCTGATCTGATAAGAtAAAATTAGGACAGCAGGTTATGAC 
ATTAAAAACATTATrAAGTGfrATATTTTGACCTTGCCATGGGAATCACTGGT 
ATATACTGGTAGTTGCTAGTGGCATTACATGAAATCn'ACCGTrGTGTTm
.CTCTCTTAGGACTCCACTGATGGATAACTGGTATrCTTTATCATATCTGTACTT • 
•GAGCACTGTTGGAACTTr.GGTAACATTAmGTGGGGATACTTGTCAGTTrAT " 
CAACAGGTAACTATCTAAACATTAGTATTGTTGTAmACCTCrATATCAGTT . 
TITACTGTATCTGTTTTATAGCTATTTATGTTATmACTATGATCCTATTGC^ ' 

aactagtttaattatitcttctgttttactcatgggAaatctga^^ 

CTTGAAAGCAGAAGACTCATTrCTATTrGAATGTTGGTTCTGCCCmAGG^ ' 
CTGCATGAACrTGGGCAAGTCACATAGCTACACTGAGCCTCAGCTTTOT • 
TATAAAATGAGGACAATAATAATCGACCTGCCCATCTCACAAGTTGCTGTAA^ 
. . GACAAAAATAAGATGCTTGAAAAGAACTTCATAACTGCTATGATGtTAGGGA • 
TTTTACACAtATTmATAATAACCCAATATGCTGTGATCTTTACATGGTCCAG 
TATTCATTCTtTGTGAACTGTAAAGAAGCATAeACAGGTTATATCTAAACATA • 
GTGTGGAGGCTTTTTTTTCAGATAmATCATTTm • 

TTAAAAAATCAAATTGATACACATAGTATATAAATTGCTCTCrGGGATTGCTA 

• TAGAACGGTTATGTTGGGACrCCATTlTGTCCTCATTCTATGAATTCrrCT^ • 
CCCTGTTCCTGTGTrGGGTTTTCTACTGCCTATATTCATTACTTCTCATTCTCAT - 
TTACTCCCTTAATTGAATGAAACACATCCTATAGCATATTCTCAAGAAAGAGT 
AGACAAAAGAAGAGGTTTTGGAAACTOGTGTGTCTGAAAAAAAAATTATTr 
AATCTAXCTTATGCTTGATTGATTGTTTGAGTAGAAATTGCTAGGGTTGAAAA' 

• GAACirrTCTTTTTTCTGAGAGGATGTCTGAATTTm 

GCTGTTGTGACAGCGCTCTGCTACCAGCTGACACTTTGTGTGTGTCCCATnr 
CACAATGACTGCCTTGGCCTAGGCCTTCAGTAGGGCCCATTCAATCTGAAGAT 
CCATGCCCTTCAGTTCTGGACATTITCTTTTTTGTTTTCTTTT ' 

. TrCTATTTTTAATAATTTTCTGTGTTCTCTCTCTGAGAACTCCTATTAGTCAAA 
CATTTGAATCAGAATCATGGACAGtrCCTCTAATTTTCTCGTTTTITI^ 

• TATTirrCCTCTCCGTGTCTTTTAGGTCTACTATGTGAGATATTITrTGAOT 

. TirrCCAATAATTCTCAAAGGAGTGCTCTGATCCAATTTTGAGAATAGTCTGT 
AGGGGAACGAGGGCAGGGAAGAGGCTATTGCAGTAACCCAGGTGAGAGATC 
ACAGTGCTGGGGCCGGGGTAGTGCGGAGTTGGTGAAACATGCTGGGCTCCCT 
GACTCCGTACATCCCCTCACATGTATGCATCCATAGCAGCCACGCCAGGGGA 
AGGGTGCAACTGCCTCCTTCAAGATGTGTTACAACTAGTTGTGTTCAAGGATG 

. TCTGATGTCmGirrGCTCCATGCnrACAAAGCTGCCACkiACTGtGCCACm 
ACAGAATCCTGCTATCATITrCTGTTTTTTGGCACTGAGATAGCACTGTCTC^^ 
AAACTrCTGTGACACTrGTATTTATAATTATAAAATTTGTTGCTATTCATTATT 
TCTCAGTTGTTGGAAAATATTCTTTCTTTTCCAAAAGGACTGTGCrCTTTGTCA 

. AAAATAAAATTCCCAGTACAGTAGGTTtATTAATrTGTTTGGTCCAGTTTGAT 
ATATTTTCAGACAGATTTTTGTTGTTGTTGTTGTTGCTGTTGAGATGGAATCTC 
GCTCrtTTGCCCAGGCTGGAGTACAGTGGTGCGATCTTGGGTCACTGTAACCT 
. CTGCCTCCTGCGTTCAAGGGATTCTTCTGCCTCAGCTTCCCGAGTAGCTGGGA 
^ CTACAiGGCACGCACCACCACGCCCGGCrAATTTTTGTAATTTTAATAGAGATG. 

• GGGTTTCACCATATTGGCCAGGCTGGTCTCGAACTCCTGACCTCAGGTGATCT 
GCCCACCTTGGTCTCCCAAATTGTAGGGATTACAGGCGTGAGCCACCACCAT . 

. GCCCAGTCTTAAGACAGAmtCATCTTTTCCCAGCAGTTATTCAGTTGTrCA 
. GATCTGGAATACACCTAACCAGTCrCCCTGTACTATTTGCCCTTTGGTCCTTAT 
mGCTC ATTTTTTAAAAGAAAGTGGAGTATCAGAGTAGCTCCTTTGAATACT 
.CCrmrrTCTmAAATGCTTrGATTCTGAAJS^C 
.ttgtttcaagaagtaacact'ctacctcagtgactgggatccAgtaata^ •

ACACITCAACTTCATTAACTTCACAAATAACTTATTCCTACnTCACAGAAGCA' 
GTGAGTACCTTGAAAACTATGAGAAGCACACAATTTTGAtGCTCCCTGGGAA ' 

acagattttaaaacctatcgtagtatactaaaactcccataagagttcaggc 

CTTACAGACTTGGTTGGCGTITCAGCACTCCCCTGATTITTACTATGCAG • 

•GCTATGTTTTCATCAGGAAAATGGGGGCAAATAGTTTCCTAAGTAGATGACA : 
AACTGTGGGCACAGAGATTGTTTTGTTTCTTTGTTTCTCTAGCACCTAGCAG 

•atctccagcacacagtaggtActcaataacattgaacgaattcatttaaaatt 
gattctatctccagaacagcagaggttcccataattaaaagrcragtatttgt" 

ACTAAAGtAGTGGTtCTTAAACirrAGAGGGCATAAGGATCCTCtTGGAAATT " 
TTATAAAAATCAGGCITCAGGGGTCCTACCtGCAGAGGTCCTAATTTGGTCAG 
TCCAGGCTAGAGTCTGGGAATCTGCATmAAGTACATTCCTTGAATGATTTG " 
• GGGAAGGGTAGTTTGAGGACCACTCtTrGAGATACACAATTTTTAAAAC^ ' 
CCTCTTTGATCCACAAAAAATACCAAAGCAAAATAGAATITTTTl^^ • 

;ttgtaaagamaccttagggaagaggatttggatcaaacgtcagtcagcata-. 

ctaattttcacitaagtaatttattcagcagttccgAattgcctgca^ 

atagacatgtatatttgtagccaacaaagtggggaaaagcagctccactgtc 

tgaagcggggcagatggtttgatatmactgatggcatggagggtgctttta. 

aaccagttttcctacgagcattgggccagatgactgtttctctagttgagtac . 

cagatgaagcagttggcttgcgtttaagctctatcrcacacacatatatatga 

tacatatatatatatatatatatatatatatatatatagagagagagagaga 

gagagagagagagagagagagagagagagagagagagagagagagagact 

CATATTTATTATAGTGAGAGGCTITCAAGACCrGGGGCTAATTAAGGAAAGG ' • 
. GGAATTGCGGGCTAAGTGATCAGTGCTITrAAGTTTCCCTTTC^ 
TTTGAATGTATGGAGTTGAACGTGAACAAGTTAAATGCCTGTATAATGGAAT 
GTCTCTGTGTAGTTACTGGTGTCCTTriTAACAACGTAAGTCATGTACATTm 
TmCCAGGAGGAAGAAAACAGAACTTAGACCCCAGAtATATACTAACCAAA 
GAGGACTTTTrATCCAATTTTGATATITTTAAGAAAGTGAGTTGGCm 
ACCnTGAGITAGGAAACrGGGCrrTATTACCTGGATAGAACACT 
TCAACCTCTTTCTTGAAAAGTGATGGACAAGGAAGGTATAGACCGTATATAA 
TCTGATGATCtATAtATGTTGGATAGCTCTCTATCCTATGATGATCTATATCTG 
TTirATAGCTCCATATCCTCTGATACGCTATACCTATTGCTTATCTGTGTATCC 
. TCTGATGAGCTATATCTGTTGCATATCCTCnTrCCTTTGATGAtCm • 

■ TGTAf ATCTCTCTATCCrCTGATAACCTATATCTGTTGtATATCCCTCTATTCt • 
CTGATGAGCTACACCTGTTGTATATCTATCTCTCTCTATCCTCTGAtAACCTAT 
ATCTGTTGTATATCCCTCTATTCTGTGATGAGCTATACCnGTTGTATATCrCTC 
TATATCTGTCTATCCTCTGATAATCTAfATCTATTGTATCTCTCTCTATATCTG ■ 
TCTATCCTCTGATAACCTGTATCTGTTGTATATCTCTCTATATCTGTCTATCCT 
CTGATAACTTATATCTGTTATATCn'CTCTATATATGTGTCTATCCTCTGATATT- 
ATAAATCTGTTGTATATCrATrGTCTGATGGACTATTTTGTATCTATCTTCTGA 
TAACCTGTATCTGTTGAATATCTCGGTATCCTCTGAGGAAGTATACCTGTTAT 
ATATTTCTCTCCCTCAGTAGATTAGAAAGCTGATGCAGAGAAATAAAAATAG 

■ TAGAAACAATTATTTAGAATTACATGAATGAAGAGCnTCrTTTTCrrC^^ 
ATCACCATGTTAACATTTTCTTTTAGAAGAAGCATGTiTTGAGCTATAAATCA 
CATCCAGTGGAAGATGGTGGAACTGATAATCCTGCTTTCAACCACATTGAATT 
GAACTCAGATCAGAGTGGCAAGAGCAATGGGACTCGTTTGTGAAGCTGCTCT 
GATACTAGATATCCnTAAATGATGTirCAATTTTAtAtGTmCTA^^
GGATCAGGTTTTCTTTGTGTGTGTGTGTGTGTTGTATCATGAGTGTTTGGGGG ' • 
ATAAGTTTTTGTTAAAACAAAGTCTGGACTATOTCATTTACTACATCATTAA • 
•TTGATGTTACTCTGGAGTTtAGAATTCTGGCATTGACATTrCCCTCTCOT • " 
.TTATTTCGATGAAGCTATAATTGTGAAAATTGTAACTACATAGATGCTGAAAG ' 
•. GCTAATACACACATATGCACATGTATTTGATTGTCAAAGGTATATTCTTAAAT' ' 

• .TTGGGTATTATTGAAAATATTTTCCATGCCTTGGTGCTAGCATATAAGTTTGG. • " 
• . AAGTTTGCC AACATCACAATTCATCTTGAAAAGAGCTTTmCCCT^ 

• CAtACACCATTCTTAGGGAGCAATGAGGTAAGAGGTCTGTGTTGTCTAGATCT ' ' 
TTGCTTTTTAtCCCCCTATCAGTCCAGGGCATATACTAACCTGCAAACTGAT^ • 
CTGAATCAGGAAGGTGGTAATCAATAAGTATTCTGGCTGGGAAAGACCGTGG 
GCCCAATGATCAAAGTCITCTTGGTGCTGTTCATrAATTCTtGTGC(nTrrGGC 

• rrGTrTTCniAGAGTTTCTGGGCITrGGCTGCTGATACTGCcnritOT^ ' 
AATTTTTATCTGCATGCCCAGTTTCTGACCTATCAACTTGGGTTITATrGTC ' 
CTCTAACTGAGCrrGTCTTCATAATTTTCTGmATTGCCCTGGGOT • 

. • GTCTCAAGACACTCATGTGAATCATGCCACCCCAAATCCTGGCTTATCAAGTC 
CCAGACTATAAATTATGAACTCCCATTAGCrrGGTACTAACATATACTTGATG 
TAGGTATTTATGGACTTGATGATCCAAGAATATTATATTCTTCAAAATGGTTA 
AGCTCCATGGAGTTAGATGACTA:CACTTAATGCrATrAAGTTGAACrnTGAA 

. TGTCAACTAATTTGCAATCAATTAAAGATACATATGCCTAGAAATnTGAAAT 
TrCGGTATATTTATCCAGTTAAAGGGCTAAATTATATAAGCAAACACrACTTT 
TTTTAAj^ACGTCTGGACTCAAAAAATGCTTTGTTCCATGTTTTAAAAIT^ 
AGTAGCAGTCTCAAAGTTGCrrAGCTGTTTATTTTGCTATGTTCCTAGCrAAG- 
AGTTTGGTTATAGGAGTTCATCAATAACnTATTTTTTGTACAGTTCCCACATTA ' 
GATACTGTTTAAAAGlTCTmTrAAACTCAATITITm 
AAATATTTAGATACATACAAATGTTTiTATGATTAAATAATTTTATGOT 
TCTGATACGTGTTATTTAGGTAATCATGCCCTGTACATTTAGAGGTTGCTAAC 
TGACAATGTTAAGAAATTTTAAAAAAAAAAAAAAGCCTGGGCATGAtGGCTC 
ATGCTTGTAATCTGAACATTTGGGAGGCTGAGGCAGGAAGATCGCTTGAGGT 
CCAGAGTTTAAGTCCAGCCTGGAAACATAGTTAGACCTCATCTCrACAAAAA 
.TAAAAAATAAAAATAAAAAAAACTTAGCTAGGCATGTTGCCACATACTTGTA . . 
.GTCCCAGTTATTAGGGAAGCTGAGGTGGGAGGATAGGTTAAGCCCAGGATTT 
• CAAGGCrGCATTGAGCTAtGATTACACCACTGCACTCCAGCCTGGGTAACAG 
. AGTGAAATCTTGTCTCTGGGAAAAAAAAAAAAAAAAAAAAAAAGAGAGAGA 
GAGAGAGGGAGATTTATATAACATTTAAATTAACTGCATAAACCTGGGCAAT • 

. GTTCAAAACTCCATCTCTACAAAAAAACACAAGAATTAGCCAGGCACGGTGG 
TGTGTGTCTGTAGTCTCAGCTACTTAGGAGGCTGAGGTGGGAAGATTGCTAG 

. agccaggaggtcgaggctgcactgagctgtgattgcgctactgtactccacc 
ctgggtgatgaagccctaactcaataaataaataaataaataaataatacaa 
ataaattaactatatatoatgtatttot 

, ttcctgcttttctttititit attt^^ 

gcgcacattgtgcaggttagttacatatgtatacatgtgacatgctggtgcgc 

tgcacccactaactcgtcatctagcattaggtatatctcccaatgctatccct 

.accccctccccccaccctaccacagtccccagagtggatattccccttcctgt 

gtccatgtgatctcattgttcaattcccacctatgagtgagaatAtgcggtgt 

ttggtttltrg^ttcitgccatagtttactgagaatgatgatttc 
CATGTCCCTACAAAGGACATGAACTCATCATTXmATGGCTGCATAGTATTC.

catggtgtataagtgccacattttcttaatccagtctatcattgttggacattt' 
gggttggttccaagtctttgctattgtgaataatgctgcaataaacatacgtg' 

TGCATGTGTCTTTATAGCAGCATGAtTTATAGTGCTTTGGGTATATAGCGA'GT ' 

aatgggatggctgggtcaaatggtatttctagttctagatccctgaggaatc 

• gccacactgacttccacagtggttgaactagtttacagtcccaccaacagtgt 

AAAAGTGTTCCTGTTTCTCCACATCCTCCCGAGCACCTGTTGTTrCCTGACtTT- ' ' 
TTAATCATTGCCATTCTAACtGGTGTGAGATGGTATCTCATrGTGGrrrrGATT 
TGCATTTCTCTGATGGCCAGTGATGATGAGCATTTTTTCATGTGTTHnTTGGCT ' 
GCATAAATGTCTrCTTTTGAGAAGTGTCTGTTCATGTCCTrCGCCCACTTTTrG ' 
. ATGGGGlTGmGTITITITCTTGTAAATTTGTTTGAGTTCATTGTAGATTCT^ 
GATATTAGCCCTrrGTCAGATGAGTAGGTTGCAAAMTTTTCTCCCATTTTGC 
AGGTTGCCTGTrCACTCTOATGGTAGTTTCTTTTGCrrGTGCAGAAGCT(^ 

• TrTAATTAGATCCCATTTGTCAATTTTGTCTTTTGTTGCGATTGCTTITGGTGTT • 
TTGGACATGAAGTCCTTGCCCATGCCTATGTCCTGAATGGTAATGCCTAGGTr " ' 
TTCTTCTAGAGTTTTTATGGTTTTAGGTCTAACGTTTAAGTCnTrAATC^^ 
TGAATTGATTTTTGTATAAGGTGTAAGGAAGGGATCCAGTTTCAGCTITCTAC' 

. ATATGGCTAGCCAGTTTTCCCAGCACCATTTATTAAATAGGGAATCCTTTCCC 
CATTGCTTGTTTTTCTCAGGTTTGTCAAAGATCAGATAGTTGTAGATATGAGG " 
CGTTATTTGTGAGGGCTCTGTTCTGTTCCATTGATCTATATCTCTGTTITGGTA 
CCAGTACCATGCTGTTTTGGTTACTGTAGCCTTGTAGTATAGTTTGAAGTGAG 
GTAGCGTGATGCCTCCAGCTTTGTTCTTTTGGCTTAAGATTGCCTTGGCAATG 
CGGGCTCTTlTTTGGTTCCATATGAACTrTAAAGTAGTTriTrCCAATrCT^ ' 
AAGAAAGTCATTGGTAGCTTTATGGGGATAGCATTGAATCTGTAAATTACCrT • 
GGGCAGTATGGCCATTTTCACGATATXGATTCTTCTTACCCATGAGCATGGAA 
TGTTCTTCCATlTGTTTGTATCCTCTmATTrCCTrGAGCAGTGGTITGTA^ 
CTCCTTGAAGAGGTCCTTCACATCCCTTGTAAGTTGGATTCCTAGGTATTTTAT 
TCTCllTGAAGCAATTGTGAATGGGAGTTCACTCATGATTTGGCTCTCTGTrT • 
•GtCTGTTGTTGGTGTATAAGAATGCTTGtGATtlTGGTACATTCATTTTG 
CTGAGACTCTGCTGAAGTTGCTTATCAGCTTAAGGAGATtlTGGGCTGAGTCA 

• • ATGGGGTTTTCTAGATATACAATCATGTeGTCTGCAAACAGGGACAATTTGAC 

TrCGTCTmCCTAATTGAATACCCTTTATTTCCTTCTCCTGCCTGATTGCCCTG 

ggcagaacttccaacactatgttgaataggagcggtgagagagggcatccct 

• gtcttgtgcgagtmcaaagggmtgcttggagtttttgggtattgagtatg 
atattckk:tgtaggtgtgtcatagatagctcttattattttgaaatacatccc 
atcmtacctaatttattgagagtttttagcatgaagggttgttgaattt^ 

GAAAGGCTTTTTGTGGATGTATTGAGATAATCATGTGGTTTTTGTCTTTGGCT^ 

tgtttatatactggattagatttattgatttgcatatattgaagcagccttgca 
tcccagggatgaaggggagttgatcatggtggataagctttttgatgtgctgc" 
tggattggttttgggagtattttattgaggalttrtgcatcaatgtrcatcaa 

• gatattggtgtaaaattgtgtcttttggttgtgtctcrggggggctttgttatc . 

. AGAATGATGCTGGGGTCATAAAATGAGTTAGGGAGGATTCCCTGTTTTTCTAT • 

tgattggaatagtttgagaaggaatggtaccagttgctccttgfacctctggt 
•agaattggggtgtgaatggatgtggtcctggactgtttttggttggtaaagta 
ttgataattgcgagaatttgagctcctgttattggtgtattgagagattgaag. 
ttcttcctggtttagtcrrgggagagtgtatgtgtggaggaatttttccatttc 
TTCTAGATTTTCTAGTTTATTTGCGTAGAGTTGTiTGTAGT^

AGTTTGTAmCtGTGGGATCGGTGGTGATATCCCCTTTATCATrTTTTATrGT 
GTCTAtrTGAtrCTTCTCT(nTITrTrCTTTA^ 

ATTTTGTTGATCCTTTCAAAAAACCAGCTCCTGGATTCATTAATriTrGG^^ 
GGTTTTTTGTGTCTCTATTTCOTCAGTrCrGCT^ 

CTTCTGCTAGCTmGAATGTGTTTGCTCTtGCTmCTAGll 
mTTATACTCTAAGTTTTAGGGTACATGTGCACATTGTGCAGGTTAGTTACA 
TATGTATACATGTGACATGCTGGTGCGCTi3CAeCCACCAACGTGTCATCTAGC 
ATTAGGTATATCTCCCAATGCTATCCCTCCCCCCTCCCCCGACCCCACCACAG 
TCCCCAGAGTGTGATATTCCCCTTCCTGTGTCCATGTGATCTCATTGTTCAATT 
• CCCACCTATGAGTGAGAATATGTGGTGTTTGGTTTTTTGTTCrrGCGATAGTIT • 
ACTGAGAATGATGGTTTCCAATTTCATCCATGtCCCTACAAAGGACATGAACT 

catcattttttatggctgtatAgtattccatggtgtatatgtgccacatitfct 

TAATCCAGTCTATCATTGTTGGACATTXGGGTTGGTTCCAAGTCmGCTATTG 

tgaatAgtgccgcaataaacatacgtgtqcatgtgtctttatagcagcatgat 

TTATAGTCCTTTGGGTATATACCCAGTAATGGGATGGCTGGGTCAAATGGTAT 

TrCTAGTTCTAGATCCCTGAGGAATCGCCACACTGACTTCCACAATGGTTGAA 

CTAGtlTACAGTCCCACCAACAGTGTAAAAGTGTTCCTATTTCTCCACATCCT 

CTCCAGCACCTGTTGTTTCCTGACTTTTrAATGATTGCCATTCrAACTGGT^^ 

AGATGATATCTCATAGTGGTTTTGATTTGCATTTCTCTGATGGCCAGTGATGA 

TGAGCATTTCTTCATGTGTTmTGGCTGCATAAATGTCTTCTTTT 

TCTGTTCATGTCCTTCCKICCACTTmGATGGGGTTGTTTG 

ATTTGTTTGAGTTCATTGTAGATTCTGGATATTAGCCCTTTGTCAGATGAGTA 

GGTTGCGAAAATTTTCTCCCATGTTGTAGGTTGCCTGTTCACTCTGATGGTAG 

TirCTTTTGCTGTGCAGAAGCTCmAGTTTAATTAGATCCCATTT 

TGGCTTTTGTTGCCATTGCTTrTGGTGTTTTGGACATGAAGTCCTTGCCCACG^ 

CTATGTCCTGAATGGTAATGCCTAGGTTTTCTTCTAGGGTTTTTATGGT^ 

GTCTMCGTTrAA.GTCTTTAATCCATCTTGAATTGATITrTGTATAAGGTGTAA 

GGAAGGGATCCAGTTTCAGCTTTCTACATATGGCTAGCCAGTTTTCCCAGCAe 

CATTTATTAAATAGGGAATCCmCCCCATTGCTTGTITITCTCAGGm 

AAGATCAGATAGTTGTAGATATGCGGCATTATTTCTGAGGGCTCTGTTCTGTT 

CCATTGATCTATATCTCTGTmGGTACCAGTACCATGCTGTTTTGGTTACTG^ 

AGCCTTGTAGTATAGTTTGAAGTCAGGTAGTGTGATGCCTCCAGCTTTGTTCT 

TTTGGCITAGGATTGAOTGGCGATGTGGGCT(nTITTTGGTO 

. TrAAAGTAGTTTrETTCTAATTCTGTGAAGAAAGTCATTGGTAGCTTGA^ 
ATGGGATTGAATCTGTAAATTACCITGGGCLVGTAtGGCCATTTTCACGATATT 
GATTCm'CCTACCCATAAGCATGGAATGTTCTrCCATtrGTITGTGTCCTCTTT 
TATTTCCTTGAGCAGTGGTTTGTAGTTCTCCTTGAAGAGGTCCTTCACATCCCr 
TGTAAGTTGGATTCCTAGGTATTTrATTCTCTTTGAAGCAATTGTGAATGGGA 

. GTTCACTCATGATITGGCTCTCTGTTTGTCTGTTGTTGGTGTATAAGAATGCrT 
GTGATTTTGGT ACATT GATTTrGTATCCTGAGACTTTGCTGAAGTTGCTTATCA 
GCITAAGGAGATmGGGCTGAGTCAATGGGGTTTTCTAGATATACAATeATG 
TCGTCTGCAAACAGGGACAATTTGACTTCCTCTTTTCCrAATTGAATACCCT^ 
ATTTCCTrCTCCTGCCTGATTGCCCTGGCCAGAACTTCCAACACTATGTTGAA 
TAGGAGCGGTGAGAGAGGGCATCCCTGTGTTGTGCCAGTTTTCAAAGGGAAT 
GCTTCCAGTTmCK^CCATTCAGTATGATATTGGCTGTGGGTTTGTCATAGAT 
AGCrCTTATTATmOAAATACGTCGCATCAATACCTAATTTATTGAGAGT^

■ TAGCAtGAAGGGTTGTTGAATTTTGTCAAAGGCnTmCTGCATCTATTGAGA 
. TAATCATGtGGTTmGTCTTTGGCTCTGTTTATATGCTGGATTACAm 

• ATTTGCGTATATTGAACCAGCCTTGCATCCCAGGGATGAAGCCCACTTGAtCA 
TGGTGGATAAGCTITrrGATGTGCTGCTGGATTCGGTTTGCCAGTATTTrATrG 

: :AGGATTTTTGCATCAATGTTCATCAATGATATrGGTCTAAAArrCrClT^^ 
. GTTGTGTCTCTGCCT.GGCITrGGnrATCAGAATGATGCrGGCCrCATA^ 
GTTAGGGAGGATTCCCTCTTTTTCTATTGATTGOAATAGtTTCAGAAGGAATG 

• GTACCAGTTCCTCCTTGTACGTCTGGTAGAATTCG(krrGTGAATCCATCTGGT 
CCfGGACTGTTTTTGGTTGGTAAACTATTGATTATTGCCACAATTTCAGCTCCT 

' GTTATTGGTCTATTCAGAGATTCAACTTCTTCCTGGTTTAGTCTTGGGAGAGT • 

• GTATGTGTCGAGGAATGTATCCAAlTCTTCTAGATTrrCTAGTTTATTrGCGTA 
GAGTTGTdTGTAGTATTCTCTGATGGTAGTTTGTATTTCTGTCSGGATCGGTGGT 
GATATCCCCTTTATCATTTTrTATTGtGTCTATTTGATTOTCTCT 
.TTrATTAGTCTrGCTAGCGGTCTATCAATTn"OT^ .. 

• .CTCCTGGATTCATtGATTTTTGGAAGGGTTTTTTGTGTCT 

. TCTGCTCTGATmAGTTATITCTTGCCTTCTGCTAGCTTTTGAATGTGm 

. CTTGCTTTTCTAGTTTTTTTAATTGTGCTGTTAGGGTGTC 

CTGCTrrCTCTTGtGGGCATTTAGTGCTGTAAATTTCCCTCrACAC^^ 

GAATGCGTCCCAGAGAtTCTGGTATGTTGTGTCTTrGTTCrCGTTGGlTTCAA 

AGAACATCTTTATrTCTGCCTfCATTTCGTTATGTATCCAGTAGTCATTCAGGA 

GCAGGTTGTTCAGTTTCCATGTAGTTGAGCGGTTTTGAGTGAGATTCTTAATC 

CTGAGTTCTAGTTrGATTGCACTGTGGTCTGAGAGATAGTTrGTTATAATCTC 

TGTTCTtlTACATTTGCrGAGGAGAGCmACTTCCAAGTATGTGGTCAATm ' • 

.GGAATAGGTGTGGTGTGATGCTGAAAAAAATGTATATTCTGTTGATTTGGGG 
TGGAGAGTrCTGTAGATGTCTATTAGGTCCGCTTGGTGCAGAGTTGAGTrCAA 
TTCCtGGGTATCCrrGTTCAClTCCTGTCTCGTTGATCTGTCTAATGTTGACAG 
TGGGGTGTTAAAGTCTCCCATTATTi^TGTGTGGGAGTCTAAGTCTmTGTA 
GGTCACTGAGGACTTGCTTTATGAATCTGGGTGCTCCTGtATTGGGTGTATAT ' 

■■ ATATTTAGGATAGTTAGCTCTrCTTGTTGAATTGATCCCTTTACCATTA 
CGGCCrrCTTTGTCT.CTrrTGATCOTGrrGGTTrAAAGTCTGTTTTATCCGA • 

. ACTAGGATTGCAACCCCTGCCTTTITrTGTlTrCCATTTGCTTG 

■ CTTCATCCTTITATTTTGAGCCTATGTGTGTCrCTGCACGTGAGATGGGTTTCC 

• TGAATACAGCACACTGATGGGTCTTGACTCTITATCCAATTTGCCAGCCTGTG 
TCnriTAATTGGAGCATTTAATCCATTTACATTTAAAGTTAATAT^^ 
TGAATTTGATCCTGTCATTATGATGTTAGCTGGTGAtTTTGCTCGTTAGTTAAT 
GCAGTTTCTTCCTAGTCTCQATGGTCTTTACATGTTGGCATGATTTTGCAC^^ 
GCTGGTACCGGTTGTTCCrriTCCATGTTTAGCGCTtCCTTCAGGAGCT^ . 

..GGGCAGGCCTGGTGGTGACAAAATCTCTCAGCATTTGCITGTCTGTAAAGTAT 
TTTATTTCrrCTTCACITATGAAGCTTAGTITGGCT 
TGAAAATTCrmcmAAGAATGTTGAATATTGGCCCCCACTCTOT 
TATAGGGTTTCTGCCGAGAGATCTGCTGTTAGTCTGATGGGCtTCCCTTTGAG 
GGTAACCTGACCTITCTCTCTGGCTGCCCTTAACATTTmcCTT • 
TTTGGTGAATCTGACAATTATGTGTCTTGGAGTTGCTCTTCITGAGGAGTAt^ 

• mGTGGCGTrCTCTGTATTTCCTGAATCTGAACGTTGGCCTGCCTTGCTAGAT • 
. TGGGGAAGTTCTCCTGGATAATATCCTGCAGAGTGTTTTCCAACTTGGTTCCA " 
TrcrCCCCATGACTTTCCGGTACAGCGATCAGACGtAGATTTGQ^ •■.
• ATAGTCCCATATTTC[TGGAGG.CTTTGCTCATTTCTTm 
AACITCCCTTCTCGCTfCATrrCATTCATTTCATCTTCCATCGCTGATAC^ ■ 
TCTTCCAGTT GATCG CATTGGCTCCTGAGGCTTCTGCATTCiTCACATAGTTCT ' 
CGAGCOTCSGTTTTCAGCTCCATCAGCTCCnTrAAGCACtTCT 
ATrCTAGTTATACATTOTCTAAATTTTmCAAAGTTTT^ • 
TGG^mGAATGTGCTCCCATAGCTCAGAGTAATTTGATCGTCTGAAGCCTTCT 
TCTCTC AGCTCGTCAAAGTCATTCTCCATCC AGCTTTGtTCCGTTGCT^^ . 
GAACTGCGTTCCTTTGGAGGAGGAGAGGCGCTCTGCGTTTTAAAGTTTCCAGT 
. TTTTCTGTTCTGmTTTCeCCATCTTTGTGGtm . 
TGATGGTGATGTACAGATGGGritTTGGTGTGGATGTCCmcrGTTTGCT^ 
TH'CCITCTAACAGACAGGACCCTCAGCrGCAGGTCrGT^ 
CTGTGAGGTGTeAGTGTGCCGCTGCTGGGGGGTGCCreeCAGTTAGGCTGCTC ' 
■AGGGGTGAGGGACGCACTTGAGGAGGGAGTCTGCCCGTFCTCAGATCTCCAG 
CTGGGTGGTGGGAGAACCACrGCrCTGTAGAAAGCTGTCAGACAGGGACATT 
TAAGTCTGCAGAGGTTACrGCtGTCTTTTTGTTTGTCTGTGCCCTGCCeCCAGA 
GGTGGAGCCTACAGAGGCAGGCAGGCCTCCTTGAGCTGTGGTGGGCTCCACC ' 
CAGTTGGAGCTTGCCGGCrGTTTTGTTTACGTAATCAAGGCTGGGGAATGGGG 
GGGGCGGGTGCGGGAGCCrCGCTGGCGCCTTGCAGTTTGATCTGAGAGTGCTG 
TGGTAGGAATGAGCGAGATTGGGTGGGCGTAGQACCGTGCGAGCCAGGTGGA 
GGATATAATGTGGTGGTGGGGGGTTnTTAAGCGGGTGCGAAAAGGGCAATAT 
TGGGGTGGGAGTGAGGTGATTATGGAGGTGGGTGTGTGAGGCGTiTnTTGAC 
TGGGAAAGGGAACTGCCTGTGCGCtTGGGCTTGCGAAGTGAGAGAATGGCTC 
GCGCTGGTTGGGGTTGGGGATGGTGGACGCAGGGACTGACCCGGGGGGACTGT 
GTGGGAGTCGGTAGTGAGATGAACCCtGTAGGTGAGATGGAAATGCAGAAAT 
CACCTGTGTTCn'GCGTGGGTCAajGTGGGAiGCTGTATACGGGAGCTGTTCCTA 
TITGGGGATG]TGGGTGCTGCGCGGGTATTGGTTGTTTTCITGCA 
TTAGTTAAAGGTGGGGGTGGGTGATGGTGTAGGGGGGTTCTGAGGCGATCGG. 
GGAGTGTGCGTCTTCAGCGGCTAAGGGGAGAAGATCTGGGAAGGAGTCAGTC 
AGAGAGGCTTGGGGCAGAGTTCGAGGGCCTGTGGGAGTGGCTGCCAGGTGAG 
TTGAACAGTCCGATTTrCAGTGGGGTGCCACACAGAtGGGACATGGC^ 
AGGAATCCCAGGGTGTGGGCATTGCTTGGGGCAGTGGCGAGATTCGATATAT 
GTTATTTTTAAATGACrGTAtTTGTAAGCAAATATCAAATTTAGG^ 
TCTACAATGTTTTAATAAGTAGAAAGATATGTTTGTTTTACATGAATGTGm 
TGAAGTATGGTTATTTGTTTAATAATTCTAAATGCATAtGTGTGTAAAATGCT 
TCAATTTTGGAAATCAAAGTCAGGCCATITTTTTGTCTTACCrGATTGCCAGG ' 
GAGTTACGCCATGTATTCITAATGAGAAAGATGATGTTTCGATTGTTGTTCAG 
TITGCTTTAGACAGAATATATTrTTGTGACATTTAGAACTATCAATAm 
TITATAAACAGAGGAGAATGCCTGATAGAATTCTTAAGAAAGCAATGTAAGA 

gtattagctcaaaataatttatcttaattrgtaaattitragataaaaccaaa 

taagggttaaatgttaatgcattgtgagttaaattagataatgtggtagtctt 

agttatttgaatgacaaaaacaggagtgggggaaaaagcatagaagttgtca 

atgtgtgttttgcrgtrgagaagtrgtatagcgtraaattgagccctaatgtcc 

tgtaagaatggtagataggactaagctgctaggtaggtgagagaaataatgtc 

agtagaaaagatgaggttgagttgtttggcagaatagcagtttactaagtga 

aatagtgttactraatamcaatatgattgggaatcaaagtttgagagaaa 
AGTCATTTGCCAGTTTTAAAAAATAGAGCTGTTAATTTGCAATATCATGATGT
AGAGATAGTGCCTTCTCTTAAAAATGTGTGTGATGGAAATAGTAAAATATATT- 
TAGGAGTCAGCAGGATTATTCCAACAGAGGGAGTGTAAACTTTAAAGAAAAA 
TATGATTCGGGAGGCTGAGGTGGGTGGATCATGAGGTCAGGAGTTCGAGACC 

agcctggccmcatagtgaaaccccgtctcractaaaaatacaaaaattagc' 

tgggcatggtggcacacacctgtagtcccagctactcgggaggctgAggcag ■ 

gagaatcgcttgaacctgggaggtggaggttgtggtgagccgagatcacacc 

, actgcactccagccrgggcaacagagcgagattccatctcaaaaaatacata . 

■tatatttttgacatatataatatatatatgtcagtaatattcaccccatagaa 

aatgaaaattitatacjGAAAGAtgtaaaAcagcataaattcacattcatot ■ 

attagttgcitatgcaatcattttctctccagatcattggttctcaa^ 

caamxgcctctcaggggatatttggaaatgtctggagacatttttggatgg 

. cactagtggtatctagtaggtagaatttagggaaactggtaaacatctcctg 

aggcctgcaagggtgctctcctcctccacaacaaaaaattatccagcctaag • 

atgtccatagtgtagagttggagaaaccttgccctggagaataaggttgatt 

ttcttgaagtcacacagcctggttgtgtttctatagggaaacagcctgaaaat 

tctatctgaatgttctcatctacaggtaaggatgaaaatgccactggcatatc 

taatatratgatgcagaacaatgacgatgtattttcacagcattatgaaatta 

ttaaggaccatagaattgtgaataattatttaaagaagtcttaggacagttt 

•agattctccacatgccttctaatattgaeacacattaggatgaaggaaatatt 

aaatacatacatgtaaagattttgaattttttttcaactgagcgtccaggata 

TAAATACAAGGAACAGGGAGG GGGTT GAGATGGCGGAAGTAACTCTGTATT 

GATTCTTATAGGAAATTCrGAGTTTTTCCATAAAGACAAAGAGTTrATTGAGT 

ACATGAGCATTTAGTTACTGAAAATTCACTGTATGCTTTrCTAAGTrTTGAGC 

TTATTGTTTATGAAATCCTTGAGAAAGTTGAACATTTGAATGTAAAAACATGG 

TTGTGAATCTGAAiTrrCAACrTGCTGATTAAACrCCCTGCAAGTTTCTTT 

/GTrGTCrGTTTTGGGGGGATAAATGTCAAATrGAATACAGTTAATTTTATCAG 
CCTTTACAAAAAGATACTTCCACCCTATTTACAACATAAAGGACTATTCCTAA . 
GTGCTG TCTG TAGATTACAAAAAGTATAAACAtGTAGAATTTTTGTCACAGA - 
AGACTATirrATTTITAATGAATTAACAGCGTATTGAAAAATAAAAAGTACAA 
AAAAGTACAAACTITrCTGTCCCMTACATTATAAAACGTTTTATTTTAATAG' 
CrtTAGAGGTACAGTTTITGGTTAeATGGGTGAATCGTATAGTGGTGAAGTGT 
GAAGGTTCAGTGCACTGGTCACCTGAGTAGTGTACATGGTGCCCAATAGATA 

.GTTTTTCATTCCTCTTCCTCAGCCTCCCCACCTTCTGAGTCTCTAATGTCC 
ATACCACTCTGTATGCCTTTGTGTACTCATAGCTTAGCTCCCACTTACAAATG 
AAAACATGTGGTATTTGCTITrCCATTCCTGACTTATGTCACTTAGAATAATA 
GCCrCCGGTTCCATCTAAGTTGCrGCAATAGACATTATTTCATTCITTTTTATC 
CCTGAGTAGTACTCCATGGTGTATGTGTATGTATATACATATATATATATATA 
TATCTCACATTTCATATATATACTCACATTTTTTATCCACTCATCAGTTGATAG 
GCACTTAGGTTGATTCCATATCCTTGCAATTGTGAATCGTGCTGCGATAAACA. 

• TGTGCATACAGGTGTCTTTTTGACATAGTGACTTCTTTTCCTTTAGGCAGATAC 
CCAATAGTTGTTCCAATCCAATTmAATTGGGGTAATTTAATCm 
TTGGTCCAAGTTAATrGTTGATAATATCAGGACTTTAAAAGAGAAACAGAAG 
TrCTTAACCTGAGTGTTTTTTCTTrCTTTTGMAAA^ 

AAATTTCTATTTTATATCTCAAAGtTATAGTTTTGCTTGTGGGGTATAAAAT^ 
AAGTGGACAACTAAGACAGAGAACTTAGGTGCCAAAGATGACCATGTTTATA 
CTCAATCACCCAATTTGGAACCACATC^TCAAAGAAGCAGTTGCCAGTGTTC
CCCCTAGTGTGAAGTTTCCACTTCTCTCAGTTAAAGCACCTGTCTGTCATCTC 
Am AAAG CACCTACTTACTTCCrACCTATTCAAGTCTTGATTAAGCAAAATG 
CAGATTTTCeATATACAGGAAATTrGGCATAACCITrCAC^^ 
TCAGGTCTCCATCATTTAAATTCATCAAAGAAAGAATATmGAAGTTGT^ 
CITTGTTACrCATTCCCATTTTGCAATCATGTATTGTTATTCCOT • 
AAAAGGCTTCnrrTACCCCTTACCCTTGTTrAGGCTGCACCACCAAAGGTCAT 
.'TGGATATCAATGGATGK3GATTGACrCCn*GGAGCTCCAGACTCACTCACACAT .. 
GTGCATCAAGGAtTCAGGATTCTCTCCATrrcrGCrmC(m'^\:C^ . 
CCAGACCTTTATTCTCTTCn'GTATTAGGATCTGGTCTGTCACTGGGCT^ 
ATTCTATrCTAGAGCTTCTTAAACTTCAGTGTTCATCAGAAGCACCTGGAGGA 
GCTGGTTAAAACACAGATTGCTGGGOTCACCCCAGAGTGTCTGATTTAACA 
GCTCT.CAGGTGGGTCCTGAGAATTTACCTTTCTCAAAAtTTrCCT '' 
tGGTGCTrCTGGTCXGGGAGTCACA(mTGGAAACTACTGTTCCAGTCCACAG 
TTGTCTCTITGGAAACCGAATCTGATGATTCACTCCTCTGC^ . 
. ATGAGTAGAGAATAAAAGCCCCATCCCTTAGCCTGATGGGCTTCTCAGAAGT 
AmATTGGTACCCTCCTTCACATGTTACATAGGCTTGTCACATTGAGCTTCTC' ■ 
ATATGTGCTAAATATACCACCATTTTCTTGCCTTCTTGTTATTTTACATGCTAT 
CCTCTTTGTGTAGGCTACCCATCCTCTGTCTACTTCTCAGATCTTTGAAAAACA 
CCTGCTCAGTTGTTAGAACCCAGCTTACCTATCACTTCTCTAACTCTTGACAC 
ATTGCATGGGTGATCGTGATCTTATACTTACCrCCGGCrCTAGTCATTTTGTTG 
TACTGTACATGTATTAATGTACACAGCTATCTCTTAGGTGGCACATAGtCTCT 
ATTCCTGATGTTTCCATCCAGGTGGATGAAGTGTGCATTAGAGTAACTTTCTG 
GATCTCTCrCrGCCCCTTTCGTGCTTATTCTCCCTATGTAAACAGGAAGTGACT 
TTTGTGATCAGTAAGTCTGAGAGAGGAATCAGACATGTATATCTTAGTCCTTT 
CCACITCCATITCTTITrGGCATCrGCCTGCTTAAAGAATATGCATGATCTATG 
CCTTACAACTCCTTGCTCCCATGATCTCTTTGACCTCTATTACTCCATGCCAGG 
TCTTGCCATTCCTTAAAAACATGTTCCCACCTCACGGTCATGTGCATTGCTTA 
CAACACCCTACrCAATATCCATITGGCTCACTCrTTCAGCTCCTACAGGTCTTT 
. ATTCAGATGTCATCTTCTAGGTGAGGTATrCTCTGATCTCTi^TTTAAAATTGG 
AACmCCTCCGCCATGCAGCCrATCCCCCrXGCITGCTTrATTTCTCTCCCAT - 
CrCtATTAtCATTGAACACACAATATTTTACITGTTrGTTGTATGTCA^^ 
AATAAAATAAAAACTCCAAGAGGTGAGGATTTTTGCTGGTTCTGTTTAGTAAT 

. ttctctAgcagAtgtagAacatggaaggcactcaatacaaattggaatacat 
gcttttggtcatgagataagggitagtgataaaaatagcctgcttccatagg 
gatgcttggggtcttgacaccagccggtgactagatatgtgtaattctcagat 
ttagtgttagggaaactttgttgacttgtagttagtcatgtcttccaatcatc 
.cattaccaataatattagtaatattgtaataaataagagactgatctctacca 
tcacrgagtttattgtctaatacagaaaatgggcaaaatacaagtagttaca 

GTAAAGTGTTGTAACCTGAACACAGATGTGCCTGCTCGCCACTTGAAAACTA 
AAATAAAGAGAGAAGAGAGTTGGTGGGAGGAAACGCAGGTTTATTTGGAGA 
• ACCAGCAGACCAAGAAGATGATAAACTGTTGTCCTAAAGTACCATCTTAAGT 
CAGTACAAATTGCAGArrATTTTTATGTTAAGAACAGGGGGAAGGAAAGGTG 
GGTGGGATCAAGAGGTGACTGACAACTGCAGACATCTGGGCACCAACAAGG 
GTCrGAGGAGGrrGAGAACTTCTATTTCCrrGGTCAGGTCACAATGCrCTTAT 
AAATATTTAACAAAACATAGTTGTTrACATACTTTCCCTTTAATCACAGAGTT 
AGmCAAAAACTACATGATTGTTTCTTTGCATATGATATGGraGTATTTATG

TCCCCACTCAAATTTCATGTGGAATTGTAATCCCTACTGTTGGAGAAGAGGCG 

TGCTGGAAGTTGATTGGATCATGAGGCCGACTTCCCCATTGeTGTTCTTGTGA 

TAATGAATGAGTTCTCATGAGATCCGGTTGTTTAGAAGTGTGTAGCACCTCCC 

CTTTTGCrCTTTTGCCTCCTGCTCCAGCCATGTAAGAT • 

TGCCTTCTGCCATGATrGTAAGTTTCCTGAGGCCTCCTCAGCCATGCm 

ACAGCCTGCAGAATCATGAGCCAATTAAACCTCTTTGCTTTATAAATTACCCA 

. GTGTCAGGTAGTTrCTTACATTTAATAGCAATGCGAGAACGGACTAATTCAGC • 

•ATATTATCrCACTGCTCTAAAATGATCCtAACCTACAtGCAGGAATGGGTA 

GGCrCCrrAAACAAAAATGGAGTTATATATQTTAGTTCTTTTGCT 

•GTTACAGTGTGGTAAGTATTGCAATTGGAGCATTGCATGTGCTATAATCCAAA 

CACGTGCTAAGTGACATAAGTATCAACGAGGGAGTAACAGGGATGGACAAA " 

AAG<3GACCAAAtCCAQCTATAGAATACTTTCCrGACTAGAT6AGGAATA^ 

TCAGATTTGAAAGATGTATAGTCATTAGCTAAGeAAAGGAAAGGAAAAGAA ■ 

GAAGTGGTnTTAGGCAAAGGGAACCACATGTTCTAAAGCCTAGAGGACTGAGG 

GATCATGGTGCATAAGAAGAGTATGTATATAGGGTAGAGTGAATGATAACAT 

GGGCACCAGCCAGACTATGGGAAGTCATGCTTGAGGTTTTAGACTTTATCCTT- 

ATGGGGATAAGAAACCACTGAAGAGTTGTAGGAAGAGACGTACGATATGAT 

caaatttggairrctgaaagttcagttcaccatagctataaagtgaagaatg 
gatgggtggaagggagtatgcctgaggatcaggagactatttaggagtctgt 

GTTGTAGTCTTTGTGAGAGAAAAtGGTGGCCTAGATTATGATGGTGGTAATG 

aggatgaagaaagatagatacatgtgagaggattgaattgacaggacttgg 
tacattattggaaagaaagatgtcagamttacttatgtttitctgcttaagc 
Attttgggtggacatggagccattcaa^^.ggtagttaccataaaaaacttgg 
tgtaggataggatgctcagtttcagacctattgggttccagcagcaattctca 
acctaggaaagttcttaaaaataaaataaattaaatctgtgagttattagtg . 
atactaaaatcataatcaatttgtaaccatrccatttatcctttrcitctcot ■ 
cttctgctccttcttcatcattgtcctcatctccttctccttctactattgtttt 
attcatgttggggttccatagactatragtagctgattatttgtgggaaggga 

AGGTATTTGAGTGTCnTACrAATTrTCITlTATTAGG™ 
GCAGlrmGTTAATATTATTmCrrGACATATmCCAGA^ 
CTGCTAAATTTTTCAGAACATTTGTrAATCTGAAAATATTAGATTAACTCTCA 
TATGACAATTTATCTGGTTAACAAATTTTAGTTTGAGAGTTCACCCAACATTT 
TTAAATATrCTTCCATTATCTmAGTATCTATTCCTGATGTTGAGATGTTTGC 
TGTCTGCCTAGTTGTTATCCCITGGAGATACAGTAGCCrrTTGATATATGTGGA 
•TAATTGGTrCTAGGTCTCCCAACGTAACCAAAATTTGCATATACTCAAGTGCT 
•GTATTCAGCCCTGCAGAACTCGTGTATACAAAAAGTTGACTCTCCATATATGC 
•AAGTITCACCTCCTGCAAATACTTCATTTTTGGTATATTTTCGATCTGCATTCA 
TTTGAAAAAAAATCCATGTAAAAGTGGAACCTTCCAGTTCAAACCTGTGTTGT 
. TCCAAGGGGCAACTGTAATCTCTCTmCTTTGTGAAAGCATTTAAAATTrAC 
TCCTTATTGTGCTITTTCTCn'AGTTTTGCCACAATGAGTCTAGGTATGAAm 
TTTTTATTTCCTCCTTGGGACTTTTGTTTCTTCAATGTAGAATCATACCm 
TATTGGAAATATTTTTGGTATTATGTCTGAATATTTCCCCTITCCCATATTTTT 
• TCTATTCTGTACCTCTITAATTCCTGTTAGTrGTATGTTGTACCTTTTAAATCCT 
GTCCTTCGfTATCTCCTAATTTGGTTCATCTTTTCCATCTTTTTGTATTTATGTGA 
TGGAtCCTGGACAATTTrCTGAGATATmCTTTCATTCCCCTTCCCAGCTGiT 
TCTAATCTGCTAtTAAmGTCCATACAGTTCTCCATTtCAGTGAi^

TTTTCATTTCTAGAGTTGGATATGCTTCnTmcAAAATTTGCTA^ 

.CATAATATTGGATTTTTTCATTATTATTrrCAATCATTAT^ 

MTTTTAACCATATTTATGACCCCTtAATTITGTTATATrATCTGAA^ 

GGGTGCTAGTTCTTTTATTrGTTCATTTGCACACTCTCTGrC^^ 

.mCTTATGTAGTTGATCirrmATCTGGGAGTTTATCTTC^^ 

ATTTmCTAGTGACAGTTCCTTGGGCTGTGGTTAATGAAAGAGTCCCTACAT 

AGTTTCAAATTAGATTrrGCTGTGTCCrAGTTGTTTCAATGGCCTTCAAAC^ ■ 

TmACATTATCATCTCAGGTTAGGGCmTCTCTTAGGTTCGTAATATAAA 

TGCATCCTAGACCCATGGCACAAAACTTAAGCACAGGGCTTAAATTTTGATG ■ 

CCTCAAGTAACTTTTGTTTGTTITCCATCCAAAGAGTTGGCTAGAAGCAAACT 

TTCTTAATATTTAATTAAGGCnTTGGTATGCTto 

GATCAGGCAGTTCTTTAAGATATCAGGCTTTTGATGATACCTGGAATCAAGTT 

.CTAGOTCTTTACAlTCTGTAGGCAGAAACCrCATTm 

ATAAGAACTCAGCAGAtCTACACCTGCACnTATTCCTCAACATCTCCTGGCTT 

CATTtCTTTGCTTTGATTl'CCTTTTCAArrCTGGGATTGG 

aTATAAATTTGACTATGCATTAAAATGTtTATTTTGTGACATm 

AmCAATGGTTTTGTAGCAGCCAGGAAAGTCTAGCTCTCTGTTATGATrCTr 

TTAACTTGTTCAGTCCTGGGTTGTGGAGGTATGAGGAGCTATCCATCACACAT 

GGCAATTTCAAAGCAACATGCAAAATACAATAATATAAGCGGAACACATGA 

MGCTGGAGGACAGTGAGAGATTTTTGGTTTTGCATAATAGCTTTACTGAGA 

CATAATrAACATGCCATACAACTCACTCATTTATGTGCAACTCAATGGTTTTT 

GTATTTACAGAGtTGTACA^TTATCACCACAATCTTAGAAAATTTTCATTACC 

CCCAGAAGAAACCCCATATCCATTTGTATTCAGTCtCCATTTTGTCCCATTCA 

CTCCCTTTAGCCCTAGGCAACCACTAATCTACTTtCTGTCTCTGTAGATTTGTC 

TATTCTGAACACTTCATATACATAAAATTACATTCTATGTGGTCCTCTGTGATT 

GGCTCCTTTCATGTAGCATAATGTTTTCAAGGTTTATCTATGTAGCATGTATC 

AGTATTTCATTCATTTTTATGGCCTAATAAAATCATGCACCACATAATGACCT 

•TTTGGTCMTGACAGACCTCACACtrGACAGTGGTCXJCATAAGATTATAATGC 

AGTTGAAAAATTCCTATTGCCTAGtGACATCACAGCCGTTGTAATATCATACA 

GTCATAAAGCAACTCATTACCmTrCTGTGTTTAGGTACACACATATTTACCA 

TTGTGTTATAGTTGCCTACAGTATTCAGTATAGTAACATGCTATACACGTTTG 

TATCCTAGGGACAGCCGGCTATATAGCATATAGCCTAAGTGTGTAGTAGGCT 

ATACCATCTAAGTTTGTGTAAGTACACTCTATGATGTTCACACAATGATAAAA 

ttacrcaatgattaaatrcttagaatgtatcccaattgttaagtgacatatga 

ttgtattccaitgtatggctatcctatattttatttatgcatga^^ 

gaacatttgtgttgtttccacttattggttattaagaaacatgttgctctgaa 

catttgtgtacaagtttctatgcgagcatatgtmcagttctttgggagata 

tatttagcagtggaat ggct gggtcatatggtaattctatgcrraaccattrt 

gggaactgccagactattttecaaaggagctgcaccatttaagattcctatca 

acagtgtatgagggttccaatttcrccatatcctggacaacacttattatctg 

tatattttatmggccattctaatgcatgtgaagtggtatctcattgtgactt 

tgatogcatitcccrgatggctaatgatattgaccatcitgtcatgtttattg 

GCCATTTGTATATCircfrTGGAGACATGTCrAATCAAATCCTTTGCCCATTTT' 
TAAATTGGCrrATTTGTITITGTTAATrATTGAGTTGTAAGAGTTCTCAGAAGT 
CTTATGTTTAAGACnTGCAATFAATATATTTAAGATATAATCCCCTTAGATAC 
ATAAmGCAAGCCTTTTTTTCCCCAlTCTTTG^^

: AATTGAAGTTCTCTGAGTTACAGAGAAAAGTCACAGAAGTAAAATATAATGG 
.GACTTTTGAAGTACAGA^^TATAAAGCCTCCTATTCAfCCATTTAAGTCGCAG " 
TAGA^^GATGACATTTTAGTAAATAAGACATACAAAATACAGTGCAATTTAT 
AACAGGACCAGTTGTGGAAGTGGAGAGAGAAAAAAAGGAAACAGATGAGAG- 
AGGTGGAAAGTTAAGAGGAGAGATAAATGCATAOTTCTGCCTTilTG^ 
•ACCAAGGATGGCAtCATTATATAACACTGGGTGGAAGCCAGGATACCCAGGA' 

• AGAGCAAGACTGAAAATAAATGGTAGGAGAGATTCAAGGAATGCCITGCrTT 
•ATATGCCTTTGQAGAGGCAAGGGTCTGGCCCAATGAGGTGGAAAGGGAGTCT 
.ACAAGGGAAGTGATCAACCAGAGAAGAGTGTGAGAGGGItCAGAGGAAAGT 
ATAGTGGTGGGATATrCGGAGCATTAACCAATGATGCAAAGTCTAGCCrCTT • 

. ATGTTGACAATAATAAAACAGCTTGGGTGGTAGGAGCTAGGTTTCCCAACCC 

; CAAACTGGAAATGGATTTCCTGTGCTGGGAAAGTTGGAGGGAACA<3GTGAGC 
ACCAATGTATTATCAAGTCTCT(3AGGCGTATATCCCCATTGGTTCTTTGTGTCC 
TAACACCATATGCTGAGCTCTGGGATGGTGGCAGCAAACCACCATCGATTTG 
TGGAATTTAAAAGCTGACAAAATCTGTTCGCCATCTTTTGGATAGAGCTTACA 

gtcacaagaccccaggaaagAcacagctgagtggtgtttctaagcaaacAta ■ 
gggctttggaaagttagagagactttaccatgtttggtgagcattaaaagaa- 
gaaattgctgcctgaagtatccaggggaataggtctctaaacagcctcacaa 
ttttccaccacccatggtagtggaaaaacagtataatgactttcaagctccca 
agggcagcatggaggtagggaaggaaacctgcatggcagtaaagctcaatc 
aggctgagacatgcttaggaaatcaagaatctgtggttgcagaatagaacat 
gaatattataaactgaggcaattttaaaaaggcggatggggagtggtgatga 
gaggaagagacactattcttttattcrcmttatatcaaatagtcaatgaat 
actgtacctaatgattoatttcaaccttaatgctcn-cctaattgtttcccttc 
aagttagacacctgtgttcactgctgtatrcttagcatccagaacagaactgg 
. ctggctcaataaatatttaatgaatgaacgaatccataggattcccacatctt 
agagtttttgtccacttttttaaaaaaattaatgttgcaggcttgattccctg 
ggaagcagactctaatagggagggtagtgtgcttagcatttatttaggatca 

ATATCTATGGAAGAAAGAAGAAGGAAGCAGGTTTGGGTAGAGGTGGAGGTT 
GAGCTACAATGCAGGTTCAAGCAGAGTGTCCCCACAGAGAGCACTGAAGTGA • 
GGGAGCCCTTCACAGlTGTCCTAATATTGCeAGGCTTTCGTATTCAXGCATCG" 
ATTAGTCATTGACATGGGCTGCCACAGAAAGGGGCAGGTCTTGGATCAGGTG' 

■ ACCCTCTGCAATGAAACAGTCTTGAGGAGCCTGACAGCTGGAGGGAGTCTAC 
CAACAGCACTCCCAAAAGCTiGGGCAAGACATCTTTCATTGAAGAGGGACCT 

: TGGCAGCATACCATGGTGT.CCACCCCATGAAGeAACTATATCTTAGAAAACT 
TAAGTCCmGTCirACACCTTGGATCCAAACTTAGTTTtGTCTGACTCCAGA 
GCCCATAATCAGAATAGAGTTTCTCTATTGTACATGGTGATGTAAACTCAGTC 
CTAAACTGGCGAGGGTGCrGGTTGGACTTGGGAAGAGTTATTCTCAGTGTTCT 

; GGGTTAATGCAAAATGCAGGCCAAGGTGAAGGACCTTTGTGTGGCACTGACC 

caatccatcaagatgagttcatgtcctttgtagggacacggatgaagctgga 

AACCATCATTCTGAGCAAACTATCGCAAGGACAAAAAACCAAACACCGCATA 

ttctcactcataggtgggaattgaacaatgagaacacatggacacaggaagg 

• ggaacatctcacactggggcctgttgtggggtgggggaaggggggagggat 
agcatttggagatatacctaatgttaaatgatgagttactgggtgcagcaca 

• ccaatatggcacatgtatacatatgtaactaacctgcacgttgtgcacctgta 
cccftaaaacntaaagtataataaaacaaaacaaaacaaaagaaatacccta

tgctccactcagctgggaggcctgcacagcagtgatctgtattgaatgatat 

gattaactgataattgctactgtaactaagattacagtttgacattgctgcgc ' 

acctgcctttctaccagggtggtagttcactaaaatattatacaatcagataa . 

.aatgaattaaactcagcttgaattgtagagtatattaaagtgattcaaattat • 

"ggccataagatattgattagatcAgtagtttgcagaatgtggcccccaccca ■ 

atagaaacaacatcatgtaggtagtttatcaaaattgaaatttcctggtccac. 

catatacttactgaatcagaaaegqtggtggtggtagggcccaggtgattct 

aatgtatggtaga'gtitgaggaatactggaictacatgaaaattagtaggaaa'. 

aaataaaaattgggtttataattagcaaataaatgttaaataaatgattcm 

taaaaactatttgtaaaatgactactcaagtcaagaaatagtacattgccag 

caccttggatdcctgtttatcccttctcactcacaeccgaaagcaaac^^ 

gttttattttatgcttitaatatctaggmtacatccctgagtacaagtt^ 

aagtttgtgttaataagcaaaatagtaataaatattaaaaaatgattgaggc 

cacacgcagtggctcacacctgcaatccgagcactttgggaggcagagacgg 

..gcatatcacctgaggtcaggacttcgagaccagcctggccaacatgatgaaa 

ccccgtctctacttaaaatacaaaaaaactagccaggcttggtggcaggtgc 

ctgtactcccagctactcgggaggctgaggcaaaagaatcgcttgaacctgg 

gaggcggaggttgcagtgagccgagatcaatcgcgccactgcactccagcct 

ggccaacaagagcaaaactcgatcacacacacagacacacacacacacaca . 

cacacacacacacacacacacacacgattggacratttcctttactttgttca 

tagaacttgttitacataacggacctcaggctatccaaatgatcacaacttcc 

taattaggaAggtctgaattaatgaaggttcaagattgcctccttgggggct • 

aaxgtgtatcxiaagctgcgacccacrggtaacagctttactatttactcttcc 

ctgccaggggattagtggaatctaaattgaacagttaggtatttaaaaccac 

tcatgtggttttaaccactaaaagggttgtraagc^^gtcttctttt^t^ 

tttrrgttgaaaatattataaaattggtgtctaaatgctgacagtcaatgggc 

aacitgggaaatractcaatgtctgtrttgggaaaacagtaactggcacttac 

ATATCACATGTGATCTGAGTAGCGTTAACTCCrrCTCTAGAGTACAGGATTGT 
ATGAAGTTGAGGTCCTACCCTTAATTCITTCACGTTGAAATATATAGTTACAG 
GAAAGTGAGTTTAAATTGATTGCATGTTACrGTCTCCCTTTTAGAGTCATTTA 
TTTTAAGAAGCACTGAAAGGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAG 
CACTTTGGGAGGCGAAGACGGGTGGATCAGGAGGTCAGGAGATCGAGATCA" 
TCCTGGdTAACACGGTGAAACCtCGTC^C^AC^AA^U^TACAAAAAATTAGC• 
CGGGCGTGGTGGCAGGCGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGCAG 
GAGAATGGTGTCAACCCGGGAGGeGGAGCTTGCAGTGAGCTGAGATCGCGCC 
. ACrGCACrCCAACCTGGGCGACAGAGCAAGACTCCGTCTCAAAA^yuy^A^ 

• AAAAAAAAAiaCACTGAAAAAACTTGCGTAAAGGCATCACACTCTGTTGGGCA 
ATGGGGAAGGGGAAGGTGGGTTTCTCGCCCTCTGTCACTGATATGGTTTGGTT 

.GTGTCCCCACCGAAATCTCACCCTGAATTGTGATAATCCCCACGTGTCAAGGG 
CAC<KjCCAGGTGGAGATAACrGAATCATGGGGGCGGTTTCCCCCGTATTGTT 
CrCATGGTAGTGAATTAGTCTCACGAAATCTAATGGTTTTATAAAGGGCAGTT 

. CCCCTGCACAAGCrCTCTTGCCTGtCATCATGTAAGATGTi3ACTTTGCTCCTC • 
ATTTGCCTTCTGCCATGATCATGAGGCCTCCCCAGCCATGTGGACCTGTGAGT 
CAATTAAAGCTCirrCCTrrATAAATTACCCAGTrTCGGGTATGTCnTATTAG 

• CAGCATGAGAACAGACTAATACAGTCATTTACAAGCTATCTGGGGCTATAAG 
AA;GGATATGCATGTGCTAGGTGAGGAATACTGCACAG(iGGAAAGAGtCTGA. '

GTGAGTAGGGCAGATrOTGCAATGTGAGGCTtTAGGGGATCCCAGGGGACAA ' 

GACTTGGGAGACACATTTGTAGAAAGGtTGGAGTTTTACAGTGATGCCACTG 

AATTATCAATGGAAGGATATGATTTCCTTCGTACAAGAAATCTGGAAAGTAT 

ATGGCATTTATAGACAGGTTTCCACTrCAAAAAATCAAGCAGCATATTATTTT- 

GCCAGTTTAAAAGTTAACCCTGCfrGCTTTTtGTm 

gacagtctcactctgtcacccatgcaggagtacaatggcgcgatctcggctc :. 

ACTGCAAGCTCCAACTCCTGGGTTCAAGTGATTCTCCnrGCCTCAGCCTCCCAA '. 

GTAGCTGAGACTACAGGTACCTGGCACCACACCTGGCTAATTTTTGt^ 

AGTAGAGGCAGGGTTTGACCATGTTGGGCAGGCCAGTCTCGAACTCCTGACC 

tcaaatcacccaccttggcctcccAaagtgctgggattacaggcatgagcca ■ 

CCAAGCCCAGCCTGGTTGCTAACrATTTGGATAACTGTTTGGAATCA.CTACAT '•• 

TCCTGAGTGAGTGATTCAGACAACAGAGGTTTTTTAAAAAAACtTrAAAAAA- 

ATTTTTATCTTAATAGTTTTITrGGGTACAGGTGGTTTTTGGT^ 

GTTCATXAATGGTGATTTCTGAGATTCTGGTGCACCTGTXAeCCACACAGTGT " 

ATACTGTGCCGAAtATGCAGtCTTTTTTCACtCACCCTCCTTCCACCGm 

CTGGAGTCCGCAAAGTCCATTATATTATTCTTATACCTTTGCATCCTCATAGCT- 

TAGCTCCCACTTATAAGTGAGACCATATGATATTTGGTTTTCCATTCCTGAGT 

TACTTCACTTACTTACAATAATGGCCTCCAGCTCCAACCAAATTGCTGCAAAA 

GACATTACmGTTCCTGTTTATGGCTAAGTAGTATTCCATGGTATATATACTA 

TTTTCTTTCTTTTTCTTTTTTTTT^^ 

CAGGCTGGAGTGCAGCAGCGCGATCTCCGCTCACTGCAAGCTCCACCACCCG 
GGTTCATGCCATTCTCCTGACTCAGCCTCCCTAGTAGCTGGGACTACAGGCAC 
CCACCACCACGCCCAGCTAATTTTTTTGTATITltAGTAGAGACAGGGTTTCA 
.CCGTGTTAGCCAGGATCGTCTCAATCTCCTGACTTCGTGATCCACCCGCCTCG 
GCCTCCCAAAOTGCTGGAATTATAGGCGTGAGCCACCGTGCCCGGCTATATA 
CTACATrTTCTTTATCCACTCCTTGGTTGATGGGCACTTAGGTTGGTTCTGTAT 
TltTGCAATTGTTAATTGCAGACAACTGAGGTTTrAATGAAGATTATAGTGTT ' 
GAAGTAGGAtATTTCTAATATTCTAGTCTTATGAGGACTTATAAAATTGGGTA 
GATTATCAAATCCTGAAATTACGGCATATTCATTTTGGCITATATTTAAAATA 
. TTCCACCATCAA^jACTGGGGAAAAAAGTTGATCAGAAACATACGCTGATATT. . 
.TGGCTATATTGTTTGTTTTTGCATGCATTTATGCAATAAACAAACATCTGATTT 
CnTGCAGAGTCCCTCAGATATTCTCCTCACATTAAAGATTCCACTTACTTATTC 
■ TGTGATTTCrCTTATTCTATGAGACAAAAATACAACAGAATGTCAGAAGAGC 
GAGCTGAAAATATTCCATGTGCAGAAATTTATTTTAAATTTTATTGCATCACA 
TrATACAAGCATTAATCATGGCTTCATATTGATGACTATTTAAATGTGAAAAT. 
TCACTCATGTCAGTACnTmGGCTATTrACAAGTAAGGAATITCTATGTACTT' 
TATATATCTCTGTATTTGTATGTACATATGCAGGAATACATGACTATACATAT 
GTACACACAGAAATACATCTATGGCCATACACATAGCTATAGATATGTCATA 
TATAATAATCTTCTCAGAAAGGTCTAAAATTAAAAAAAAGGAAAGAAAAAGT 
TGTAAACAGGGTCTTGTCTGTGTTGTTGGTGACATGGAAtTAGGACCATATGA 
TGACATTCTAGGAATGTGTGTCCATGTGTCTGAATACCCTTTTTAGCCCAGTG ' 
TCTGTAAAATTCACATGGGATAGAATGCAAAAAAGGTAGCAAACAGGCTGA 
AAAACAGGCCCAGTATACAAGTTCCCCTTGATTTTAAAAACTTGAGAATGTA 
ACTGCCCATAATGTGCATGCTTGCTTGGTCAAGAATGGTCTATGGATAGATAG 
CAATTGTGCGCTGGACTGCAACGTGATTTCACAGCAGTGTGACTCCAACGCC 
AGTCCTCrACTGGATGCTGCCCCTCCTGCGGCCGCCGGTATCTGCAACATGCA
CTGTGGTTGCATTTTTAGTCATTCACTTGAGTGTGCTTTTGCACAGCC AGAAG ' . 

aagtaaacagaattcaggtccttggcctggaaagggactattttctggaaag 

agaaaagaaggcataaagatcttatgcatacacaattgctttaaaacatgag 

gtgcaatagtttgtacttaccatgcacg^ilaggcattcaaAgaatttgtaacc. 

ccaaatggagtcagggatggtataaacatttttatgotgctttatgggtgaa' 

cagactgagtcagaagggcagtctaaagcaacatcacatcctgccaactatc 

.ggggaaatagaaacaggcaggcagatactatgcrttcaagcaaaaattaga 

aatggccntittgggaagaaagcagccagtagtggtggggccagaagaaatt 

cttcatcagmctggtcctgcagcccaggagagccctgttgcaccttggtct 

ctggccgctgaggctttcaggaagaattctctttgctaatgtctgcctcccac 

CCACCTAGAAAGAAGGGGCATGTTTCCCTGGGGATATTAAAGAAAACAGCCC 
GGCTGGAGGAAAAGAGAGAGGACGCTTTGTTTCAGGGAACTCCTAGGCCAAT' 
TCATGAAGACCTTGACTGATAGGCCAGAGTCTGGCAAGAAGGCAGAGAAGT 
•GTTGCAGAAGGTATGTGGGAGGGGAAGATCAAGCAAAATTTGCAACGATTA 
AAGTAAAAAACAAGCTTCTTTTGGATTGGGAAGCTTCTAAGGAGTTTATm 
ATCCCTGTTACTGCAACAATGGTAAGCTTTGOTCAGGGAGTTTAAAGCACCC 
ATTCAACCTCGGAATGGCCTCAGACGCTGAGCACACCTCTCACCTACATGGTC 

ACGGCCACTCGTTGTGGTGTGCnTITCCATTCTTCTTCCTCrrCCTTTCGTrCOT 
CrGGAGACGTTCCAGTTCTGCTTCATACATCATCTCCTGAATCAAGTACTTCT • 
CTCTTCTCATTCGATCCCTTAGGTCTTTTGGGAGGTCTGGGATCAGATATGAA 
ATGAGGTGCTTTATACAAAACACGAGGTGCTAAAACAGAGGAGAATGTAAC 
AGCGTCAGGTCTACTGTGCATCCTGAATGAGGTTAGAGAGTTGGCTGAAGAA 

ggatgtggaccctctggggtatgtggagagtgcaaagtgaccaggggaggc 

■ gtgtgtaatgcaagatgataggacaggtgaggcctgaactggtctgatggca 
ccacctaaggaatagaaagaggagtctggaaggaaggtgggaaaagaatgg 
tcagataattaattcaaaggagagccagaagaatgtattctgtggagtcaga 
aaacccaggttgaagctggtagctttcactgcttaagggctgtctgtctacct 
gaggatgaattatttaatltatctga<xctcattcccctcatctataaagtg 
gfattgtgagaattatgagaaaatgtttaaaaatgccttgtagactgtcaag ' 
tgctttacaaatct tggtta taagaattatcagtggggctagggaaatgatca- 

mCAiSGAAGAAGCriTlll'Cll'lll'GGTAAACAGTCACAAATTCTTrACTTGA 

AATACTAATAGCTGGAATTCTAGAGTCACTGTCACTrCTGCTCATCTATAfAA 

TGGTGACCAAGGAAGTACTAGATGACAATTTAAGGGCTATTACAATTCTAAA 

GTATTAAATAGAGACTAAATCATGTTCAAATGATACTGCTAAGTATTCTTCGT 

TTTATTTAAAACTTGATAAATTTGGTGATGCATGGATGGCACAAAATTAAGA 

agcititctgcatagttctcatattttcaaatagtraataacaaaatatta^ 

■agaaaatggAaaattctatttgaaaagccaaataaggcagccttttaaaAag 

tirrgtctgacagatttacagcagaattctaccagaggtacaatgaagagct 

. GATACCATTrCTATGGAAACTATTCCAAAAAATTAAAAAGGAGGGACTCCTC 
CfTAACTCATTTATGAGGCCAGTGTCATCCTGATACCAAAACCTGGCAGAGA 

■ • GACAACAAAAAAACTTCAGGCCAATATCCCCGATGAAC ATTGATGCAAAAAG 
CCTCAATACAATACTGGCAAACCAAATCCAGCAGCACATCAAAAAGCTTATC' 
TACCATGATCAAGTTGGCTTCATCCCTGGGATAAAAAGTTGGTTCAACATATG 
CAAATCAGTAAAeATCACATAAGCATAAGTAATTCATCACATAAGCAGAACT 
AAAGACAAAAA(XACATGATTATCTCAATAGCTGCAGAAAAGGTCrrTGATA 
ACAATCCAACATCCCTTCATGTTAAAAATTCrCAATAAGCATTCCCCTTAAAA-

ACCGGCACAAGACAAGGATGCCCrCrCTTACCACTCCTTTTCAACGCAGTATT 

(3GAAGTTCTGGCCCAGGCAGTCAGGCAAGAGAAATAAATAAAGGGTATTCA ' 

AATAGGAAGAGAGGAAGTCAAATTATCTTTTtTTTGCAGATGACCTGATCCC 

GTGTCTAGAAAATCCCATCAtCTTGGCCCAAAAGCTTCTTAAGCTGATAAGCA 

ACTTCAGCAGAGTCTCAGGATACAAAATCAATGTGCAAAAATCATtAGTATT 

CCTAACCACTAACAACAGGCAAGCAGAAAGCCAAATCATGGATGAACTCCCA. 

trCACAACTGCTGCAAAAGAATAAAATACCTAGGAATACAGCTAACAAGAAA . 

AGTGAATGAOTCrrCAAGAACrACAGACCAcrGCrrCAAGGAAAT^^ • 

GACACAAGeAGATGGAAAAATGTTCCATGCTCATGGATAGAAAGAATCAATA ' 

TTGTGAAAACGGCCAtCrGCCCAAAGTAATTTATAGATTCAATGCTATtCCCA ' 

TTAAACTACCATTGACACrCTTCACAGAATTAGAAGAAACTCTm 

ATGTGGAACCAAAAAAGAGTCCAAATAGCCAAGACAAGACTAAGCAAAAAG 

•AACAAAGGTGGAAGCATCACACTACCCAACTTCAAATTATACTAAAAGGCTA 

. CAGTAACCAAAACAGCATGGTACTAGTACAAAAACAGACACATAGACCAAT • 

ggaacagattagagatctcagatataagaccacacatctacaaccatctgat 

ctttgaaaaacctgacaaaaacaagcaatgggggaaaggattccctacttaa 

taaatggttttgggagaactggctagccatatgcagaaaatcgaaactgaac 

cccttccttacaccttatataaaaattaactcaagatggattaaagactta^ 

tgtaaagccccaaactataaaaatcctagaagaaaatctaggcaatgccatt . 

caggatgtaggcatgggeaaaaatttcatgatgaaaacaccaaaagcaattg 

caacaaaagaaaaaattgacaaatgggatctaattaaatgaaagtacttctg 

cacagcacaaaaaaactatcatcagagcaaacagtaacctatagaatggga 

gaacatttttgtaatctatccatctaacaaaggtctgatatccagagtctaca 

aggaacttaaacacatctacaaaaaaaataccaagcaaccccattaaaaagt 

gggcaaaagacataaacagacattcttcaaaagmgacattcatgcagcca 

acaaacatatgaaaaaaagctcaacattattgataattaaagaactgcaaat 

aa^w^tcacaatgagataccatcttacaccagtcagaatgttgattatttaa 

aagtccagaaacaacagatgctggcaaggtttcagagaaaaaggaacactt 

ttacactgttggtgggagtataaattagttcaaccattgtggaagacagtgt 

ggcaattcctcagtgatttacaagcagaaataccattttacccagcaattcta 

taacttgtatggcaaaggacacaaacagatacctctcaaaagaagatataca 

agcagccaaaaataatatgaaaagatgctcagaatctctaaggagattaga 

gaaatgcaaatcaaaaccacaatgaaatattatctcataccagtcagtatgg 

CGCTTATTAGAAAGGAATATAAATCATTCTATTATAAACATACATGGATGCAT 

atgttcattgcagcactattcacaatagcaaaggcatagaatcaacccaaat 
gcccatcaatgataggctggataaaaaaatgtggtacatatacaccatgaaa 
tactatgcaaccataaaaagggatgagataatgtcctttgcagggacatgga' 
cagaactggaagctgttatcctcagcaaactgacaaaggaacagaaaacca 
aataccacatgttctcacttataagtgggagctgaatgatgagaacacatgg 
acacatggtgggaaacaacacacaaaggggcctgtt'gggggtgggggtggg 
ggaagggagagcatcagaaagaatagctaagggatgctgagcttaatacct 
gggtgacgggttgatctgtgcagcagatgaccgtggcacacatttacctgtg- 
taacaaacctgtatgtcctgcacatgtaccctggaacttaaaataaaaaaat. 
• cccaaacaaacaacaagaagttttgtctgaaaaattttaagacagtgatggg 
ttaaaaatatcttctttaaagagagagtgtgccattggtaagataatttccag 
GGAAGAGCAGCTTMTATTTTCTTCITITGGTTCCCT^^
CAGTGATAGAATAGTCTTGAAGAGGTCTAGGAAACTTTAGATCCAAGCCGAG 
AGGCAACOTGGCrmATTAATCTGGCTTTAATATATGTGACAA^^^ 
ATTTCCAOTATGACTAGAGTCATAGAAATGCAAACTATTmACAGC^ • 

totaaagcctgaaagaaaatagataatatatititattgacatAtaata^ 

ACAATTGCTAGACAGATTITACmAtAAlTCCATmGAGTCTrAGCTA^^ 
AATACTIACTTGGATGCTTGAATATAAATAATTCCGTGATGATACACAACCAG 

aaataccacttgtaatagccagtcctattggaaaatgcttatgatcatgltgg 
gaaatcctgataactatttgaaaatgataAattaactatatgaatcagtatat • 

TTCAATATGTGTGGTAATGATGATGGCAATAATTGGAGACATTTAGAAAGAG 
TGAATCTCCAACTTGACAAAAACAAGCAGTGGGGGAAGGATTCCTTGTTCAA • 
TAAATGGTGCTGAGATAACTGGCTATCCATATCjCAGAAGAATGAAACrGGAC 

ttctacctatcatcataaacaaaatttaactcaagatgaattaaagacttaca 

TGTAAGACCTCAACTATAAAAATGCTAGAAGAAAACCTAGGAAATACCCTTC 
TCAATAGTGGCCTGGGCAAAGAATTTATGGTGAAdTCCTGAAAAGCAATTGC 

aacaaaaacaaaaattgctaagtcaaatctaattaaagagcttctgcacagc' 

.aagagaaactgtcaaagaagtaaAcagacaccgtacagaatgggagaaaat 

atttgcaaactatgcacccaataaaggtctaatacccagaatctgtaaggaa 

CTTAAACAAATCAACATGTAAAAAACAAATAACCCCATTAAAAAGTGGTCAA 

AGGACACAAACAGATACTTCTCAAAAGAAGATATAGAAGCAGACAACAATA 

ATATGAAAAAATGCTCAGAATCTCTAAGGAAATTAGAGAAAT.GCAAATCAAA 

.ACCACAATGAGACACCATCTCACACCAGTCAGTACGGCTTTTATTAGAAAGT 
CAGGCCAAGTGCAGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCAA 

.GGCGGATGGATCATGAGGTCAGGTTAAGACCAGCCTGGCCAAGACAGTGAA 
ACCCCGTCTCCACTAAAAATACAAAAATTAGCCAGGTGTGGTAGCGGGTGCC 

tgtaatcccacctactcaggaggctgaggcagagaattacttgaacctiggga 

ggcagaggttgcagtgagccgagattgtgccattgcactccagcctgggcga 

cagagagagaatcagtcttaaaaaaaaaaaaaaaagtaaacaaattaacag 

atgctggcgaggcttcagagaaaaggggacacctgcacactgttggtggaaa 

.ggtaaattagtccaactactgtgtagtctggagatttctcaaagaacaaagg 

gcttaactaccattcaacccagcaatcccatttctdggtatatactcaaaaga 

aaataaagcattctaccaaaaagacacatgtactcatatgtcaatcacagga 

ctattcacaatagcaaagacatggaatcaacctgggtgcccatcaatggtgg 

accagatagagaaaatgtggtatgtatacaccatggaatactatgtggccat " 

aaaaaagaatgaaa.ccatgtccntgcagcaacatggatgtagctggaggcc 

ATrATCTTAAGTGAAATAATGTAGAAACtGAj^CTAAACACrrCK:AAGTTCT. 

TATTGTACGTACTTAGAGTGGAAGCTAAACACTGGGTACACATGGACATAAA 

GATGAGAAGAACAGACAGTGGGGACTACTATAGAGGGGAGAGGGGGAGGGG 

ACAAGGGCTGAATAACTACCTATTGAATACTATGCTCGCTACCTGGGTGGTG 

GGTTCAGTCGCACCCCAAACCTCAACATCATGCAATGTAACTTTGTAACAAA 

CTTGCACATGTACCCACTGTATCrAAAATAAAAGtTGAAAAAGAAAAAGAAA 

AAAAGAATGAATCTTACTGGGCTGATGTTTTCCAAATGTTTCCAAATTATGGT 

cccagaatgcctatagtcagaatcattacaacgagctcatgaaaaacaagcc 
acagacatggtatgtcagaatatctgggagtgaggatggaaaaactgcattt 
ttgctaaagtttaagaactatggccccafiaaaattacatttaggagcaacga 
.gtrcagatitgtccattaaaaaccaactatggactttcaaatatgtcttggaa 
AA^tCAAATGCTTCAATTAAGAAAAAAGATCAAAAAGAGGTTACAfAATATG .

ACACTACTACTTCrGGAAATATAGATAATCACATTCAAAATCeCCATCACTTA. 
TITiTATrCATTACTCCTACTGAOTCATTATGTCAAAGGAAATAAAGAATTT 
GGCTGAGGAAACTTACCTCAAAGACAATGATAAAAGCTAATCGAGCAGCTAG 
GACATGCCAAAACTGCAGTGTGTAGCCATAGGGCACCAGTGAATGAGGCGG 

GTCACGGTAQTCCCGGtAtCtATAAAGACACAGAAAAATGTTGCATTCAAAC 
TATGATTTTTCTGATTCTCAAATCTCTTCCATTCTCATTATATAGTCCTCA^^ '■ . 
AATAGGAGGGAAATTTGAAQTTTCTCTATACTAGATATCTGTTGTTTTGCTTG 

tccagcatttttcttctgataaaataatatctcttctct 
tgggttgactaggacttccccctccAcctgcctgatctcagctcaaaagatga . 
gccagacctgatcagtgacatcatcaagatgtctgccagaactatcacttag' . 
tgaatgatcactaagcccacagaaggcaaaagcctgggactcctggtggcct , 

TGTATGGAGAGAGAGAGCCTGCTTGA<3AAGGAAGTCAAGAAAACAAAGCAG 

AGACAGCAGAGAGGCAGCATCCTGCTGACACTGCTTGAGGCCTGGGATCCGG 

CCATTACTGAAGCTCATATCACCCCTTGACTCTCCCAGCTAACTGAACAATTT 

mGTTGTTrAAGTCAGTTTACCCTGGATTTCTGCCACTTGCCACTAGAAGA 

TCCTTCCTAATGTATCCTCCAGCTATCACTTTTGAGTCGGCATAAAATTCCAC 

AAATGTTTGCCGAATACCTGTrGGGGACCAGGCACTATTCAGAGAACTrTAA 

tccatgtgactccattaaagtaatactttaataatacaacaagctctcataaa 
tccagtatgaagcccaagaactaaaacattaccaggaacttacatcaattgc 

TTCCCTTACCCTGAATTrCATTTTTATTTTTAACTTATATGT^^^ 
TGAATATTGTACTATTTAGTTrGTTTTTGAACTrTATAGAAAGGGTAeTATACT 
ATATGTAGTTTGCTGGGGCTTGCTTITmACTATrATGTCACTAAGATTCATT 
CACATTATTGCATGTGGTTTATTTTTATTGCTGTATAATAGACTCTAGTGTGAC 
TATATTATGGTTGTTTTATATAGCATATAAGTATACACTTACAGGCrGGTCAT 
TCCTATATTTTCTTrrCTTTTATGTTGAGACAGAATCTCGCTCTGTCACCCAGG 
CrGGAGTGGAGTGGTGCGATCTTGGCTCACTGCAATCTCTGCCTCCCAGACTC 
• AAGTGATACTCCTACCTCCCAAGTAGCTGGGACTACAGGTGTGCACAAAAAT 

GCCTGGCrAATTTTTTTGTATTTTTGGTAGAGACAGGGTm 
AGGCrGGTCTCAAACTCTTGACCTTAAGCTATCTACTCACCTCAGCCTCCCAA 
AGTGCTGGGATTACAGGGTGAGCCACTGCACCTGGCCTCTATATTTTCTTTGA 
. crrrCCCTAACCATGGAAAGCTTTGAAAAAGAAAGCTCCATCAATGATAGAC 
TGGATTAAGAAAATGTGGCACAGATACACCATGGAATACTATGCAGGCATAA 
AAAAGGATGAGTTCATGTCCTTTGTAGGQACATGGATGAAGCTGGAAACCAT 
CATTCTCAGCAAACTATCTCAAGGACAAAAAACCAAACACCGCATGTTCTCA 
TTCATAGGTGGGAACTGAACAATGAGAACACTTGGACACAGGAAGGGGAAC 
ATCACACACCGGGGCCTGTCGTGGGGTGGGGGGAGGGGGGAGGGATAGCAT 
TAGGAGATATACCTAATGTAAATGATGAGTTAATGGGTGCAGCACACCAACA 

• tggcacatgtatacatatgtaacaaacctgcatgctgtgcacatgtactctag 
' aacttaaagtataaaaagataaataaataaataaaaagaaagctccagtcc 

ACTCACTTCAGCAAATTCCCnTGGGTAAAAGTTGGCTGCAATGCTCTGTGTA 
CCACTTCTTTCTGCCTTAGAATACCAAATATTCTTCCCAGGTCATTAATGTATT 
. TGAGAAGGTGTTTTTAAATTmCTCTACATGTTTAGATATTTrCAGCCCC^ 
•GCTGATCTGGGTGCCTCAGCTGCTATATGACTGAAAACAAACCTGAAAATAT 

• TATTITAGATGATTTTAATTAGAAATTTTAAAATGTmcm 
AAGAAAAATGCTAAACAAAACAATCATAATTTAACTTAATGGTGCCAGCATT 
.AAtAC:CTCCATTCACATrAATAATTTCTGGAATATACTTTTAT(^^ "
GAGACAGGCAGGAAGAATTATGAGCCAGGAAGCAGAAGAGGAGAAAGACTC 
TGtATTTTTATOTATTTTCAAATGATTACAAAGTTTTTAACTCC 
TTGGCAtTTTATTTTGAAGCATTTCATACITAGGATCCrGTATATGTCTAACAA • 
TGCITGCTAAATAATTrCATGTACGAAATGGTCTGGACAAGCAAGTGGGA^ 
TCTACrGAAGA>^GTATCAATTGTTGCTTAATfGCTTGCTTCTCCACTAGACC- . 
GCAACAGAGAGCGGCACTAGATTATATTCCAGCCAGTCTTTCTCCCCATCTTA . 
(TTGTTACTCCATGTGGGAAAGGAAATAAGCCATCTGATGAGATGTGACAGC . 
CAATGAGACCTGGGAACTTGGCATGAGTGTltAAAGGGGTGGAGGGTATGCT ■ 
GTCACTGCACAGAAATGGGGAGCCCCTGGCTTCTCTCACCTTCACCTTCTGTA 
•AGATGAGGGACTGAAGTGGATCATtACAAAAGTGCCTATGTCAAAGCATCTA. 
GGACTCCATGGCTGTTCrTCCTTGTGGCTTrijATTACAG^ 
GTCATGATAGGGAATCTCCAAAATTCmGtmGTim 

gctccatagtttgaatgcagtggtgctatcatagctcactgcagccttgactg 

cctggacttaagtgatcctcccaacttggccrcctcaagtgctgggattacag 

gtgtgagcccacaaaattcttataacgtcaaagtattaagagaaaaaggtgc 

catctggactcttcaagaactgtgcacaagtaatctcattgagcttttagtga 

atggttttatcatttgtcagatgagataatattgcctaacttagaaggctgat 

gtgacgattaaatggagtaacagtttctggcacagagtaggtgctcaacaaa 

ttctccttgatgtacgtitgtatagatacacatcaagaggcacagtttaattc 

atttccttttagagggagagaaagaaagagagaatgagcacatgttgcaga . 

ggctactgtggtggaacctgaaatgaactgggcctctttccacaggcttctga 

■ gttgtttaacalttagaaatgtgatgtgatatactctgggcaagagagagaa 
gcagaacaaagacaagtgccctcaacagctgtcgctcaaccc.gcctggggat 
tgtcrgtccctgagtccctggagcagcggtataaccttcactactatattttt 
agttitrccctttaagcta<xagttctcaattctgactgcacct^ 
ttccactccctgacattctgtmaattggtitgggttgtatcccg^^ 
gaaggcttgaaaattcttcatgtgctrttaatgcattgttgagattgacagtc 

■ actgctttaaacagactcagccagaaggggccttcagagtaaaacctcccca 
cggtggagccagcccagctcaggcagagccacgctgggtgccacactcacct 
gcagtacttaagaggagtccccgagaactcactgccatcagatrcaggctca • 
gatcggttctcaaagtcagaaattcgaaatacagacaagctggcattcacat 
agccaaccatgcacctataagaggaggcacattctaccattagaacactcat 
caccttctctgtatccagttccctgaagaccctgcttgccagacgtCcctgca 
attgtaaatgcctttaactggcccttcc^gtcaaccag.gactcaggttggga 

TACGCTGGAAAGCCCTGGGCATGinrGAATCAGGAAC^GTTGCTTGTTtr^ 

TCTCATCCTTTTTCTGCAGAA^TCATCATATAGAACATCTCTACCCCAATTAAT- 

GTTTCAGTGAAATAGGTAAAAAGCAGCATTACATGATGGGCATAAATTGTAT 

. gtaagccacagcttcttttgttgatgatacacataaacacagttcctcattaa 

■ tggtttcatatatatttatagatattctcaaaatgttccaaAtatgattggtg 
tgcatatgatccccccaactcatatcccatgctacagtcctgttagactatat 
tccctgaaagcttctgatgttitcatgcctctgctrgttatcttttctgttag 
attccctacctcttccmacaagtggaaatagtatttatccrtcagcggcca 
actgaaacaccactcctactaggaagacaacctaatcactccagtcacaaca 
tctcactcctttccctctctctgtagtgccaccatacttacttagttcaccatt 
accagatcgcrtgtactagagttggcagcatatatgtatgccctctcctttgg 
ACTGCAATTTmGGGAGCAGAGCAGGTATCTTAtoATTTGTGTATCCTATG •

ACFrAGGACAGTCrTGTGCAGTGTAGGAATCCCATAAATATTGAATCGATCTG 

ArrGGAGGAAACTAAAGTATGAGATTeTATTGAT'CATTCTAGATTGAATGGG 

TAGATAAAATCAAGATTTAAtCATr.CrGTCAATAATTCCA'rTCAAGAAAAAG 

CCCATTTTAAA^i^GTATCATAGCAAGTAGCACTOTAGAAACATCr^ . 

AGTCTCTA>y^GAGTTTTACATATTGAACATTGTGGTTAACACATGCACTTACT 

GAAATAA.CAATGTTTGAAACmGGGTGAAATTAGCACTrAGAtACATCCTGT . ■ 

ATACATTmCTTTAAATAGTGTTTTACATACAAAAGCTTGGGTTTTGGGGTC 

AAATCTTAGTTCTGCrGGmcn'AGTCATGtAATATTGGCACGTTATATAACTT 

mACTTCTTCnTTTITrrriirrGAGACAGGGCCT^ 

GAGTGCAGTGGTATGATCATGGCTCACTGCAGCTTTGACCTCCCAGGCTCAA 
ACGATCCTCCCACCTCAGACTCCTGAATAGCTGGGACTAqAGGCATGTGTCA 
CCACACCCGATTTACGTAACTTCTTAAAGTGACAGAAGTTCCCACCTCATAGG 
. GTCCtTGAGGGGATtATATTAAACAATGAATGTAtAGCATGTGATAAGATCTT 
AGCCTATAAGTAAGATAAACATGAGTGGTGATTATTCATITrATTTATTTCTT . 
GCATCTAGCAGGGGGTCAATAAGTAATTACTGATTGAATGAATAAATCATGA 
tCTrCAGATTCTCTTCATrAACAAGTCTGGCATGATTATTCCAAGCTGCTGM ■ 

• CACTAAACCAAAATGATGACATGAACACAGTTCTGAGAACACATGGGTTCTG . 

• CATGTTATCGtGeTCAAGACAAATCACTGGCTTTATCATGGCCTTAAAAAAAA 
TGAACCAAGCTGAAGGACCCATTTGTCAAGTGACTCGTGGGTGAAAACAAAT 
GGTCAGGGAGTGTGTAACATGCCTTAATCCAACTGGCAGCAGGACTCCCCTA. 
CTGAAAGGGTAATTTGGAATGATGTCTCCCTTACCTAGGAACTGTCTAGACCT 
GGACATGAAAATAAATTAGTGCCTTGAGGATGATTAGCTGTCACAGAGAGGG 
AGTAA.GTTTTAGCAGAGTCrTTCTAGAGATCTGAGCTGTGGGAAGGGAAAGG 
TGITCrGGGAACTGGTGAAAAAAAAAGGCCCAGCACTGAAATGGAATATTCA 
TAACCCCTCCTGCACTCTTATCTCACAGACCAGCCCCGAGGCTTTATCAAGAC 
TAACGCrACTGTCCATTGGTrGTTCTGCTCAAAAACTCCATTAAGAAAATTTA 
AATTTAAATAAAACACCTGTTTCATGGACCTAATGCATGAATGACAGTCATA . • 
AATCCAAGTTGAAAATGTGTCiTTAGAGGGTGGTTACTCCTGAGTGCAAAGC 
ATTAAATGTGTACCTAGrACAGGAACACACAGTTATCAGCTGTGTACATCTTC 
CfrCCATATCATTTCCATAATGGGGGAAAAGATGCACGGACAGCTAGACGTT 
TCACTAATTCCTTTCACCTTTTGTAGACTGTTAGCTATTGTCAATATCAAATAG 
TGAAAGTTGTCATCAATrCAGGCAAGTGAGATrTATCACATTrTGCTGTATAA 
AATTCAGCTAGTGGCATGCAAGTGTGAACTCACTGACCCATTGTTATAGCAG . . 
GAAGGTCACAGGTAGTAGAGAGCCCATGTCTATCACTGGG'ITrACTGGCA^ 

• AATGCTATATACCCTCACTGTGACCCACCACCAGCTCTCAATGAGACAGAGG 
GAGATGAGGCTACACrGATGATCAGGGAGAATTGAGAAGTACCAAGAAAAA 

• AGTAACAGACGAAGGGmCCTGTGAATATAAAGTCAGTITTCTTAAAGGAA 

• TGGTAGATGGCCTCTGTAGGCACTTCCAAGAAAGATCTAGAAAATGTCTTCT 
ACAGGGrAGTGCCATITGGATTAGGATGCATATCTCTTCCATGTAACTTCTGA 
GAAGACCACACACrCAmAAATGTGAGAATTCTACCGTGAAAGCCACTCAC 
CACTrATAATTTTCAATCCTTTAATACAtAAGAAAGGGATGAGCCAGCAATC 
ACrGTACTACCCTAAAGGOTATGACmCTATACCnTGACAATTCTGCTACT 
OTATGAGTTCAATAATITCTAAAGCATTTCTAAATrGGAAAAATAAAAAAA 

■ ATCTGAAAAGfAATAGAGTGGCCAGTGAGGCAGAAAATAAGATTCATGAAG 
. AAACATACATAAATTTAACTACTTGAAAACATGAAAAATGACTAATGGATTA 
GAAAACAAAAGGCAGATCCATAAATTGAAAGTGGCAGAAGGCTCCTAAGTC
AAAGAGAAAGGTTAAAAAAGAAAGAAAGAAAAGAAAAGCreCAAAGAGCC 
•CAGTAAAATTGATACAAGGAGGCTATGAGATACACTTTAATCACTGTGGACA • 
•TrGTCrTAAATCTACCAGGTTGTTAAAATTCTAAATCTTTGATTCACT^^ 
•TGTGCATAAAGGTTGCCATmGTITrTGGTTCCCCCTAAAACAAGATAGATT • 
TCACCATATTTGAATTTGGGAAAGAACTGAAACACCATTACGACGGAAAATC 

cagggggtttactgttaagggacagcctcaagggctgtgaggtgcgggatgg- 
agagtgagtcgaggctccaccttcacatgggctcccagagttactatgccttg . 

TGATTCTGGACTTAACCrCrmAATCTCTTCCTGCTCCtCAAAGTGAAGArt^^ 
■ TAATGCAGATCCCACTCCCAGGGCTGTTTTGAGAATCAACTAAAAACACATG • 
TGAAAGTGGTTTGAGTGAAAGTAAGCTTTTTCAGGCTAAGCCAAGCAGAGAA 
. TGTCAACGATAGTTGCAATTAGAAAGAAATGCCTGGGCAAAATAACATGTCT • 
• TTACATTTCTCTTAAAATGGGATGGTATGCTCACTAAAAGTtGCTAGATAATr " 
AGTGTGTCATGGCCACATGCTrATAGTAGTCAAGAACTTTACAGAAGCCTTtr 
CAAATAACTGGATTGTGAGCTTrATGACATTCCTCAATCATCCACACCTTAGC • 
CCATTTATGTAGGATTCATAATGAATTGGACCCAGGCCAGGTTTGCATGAGTT 
CAGTTGATTTGCTTTGGTGGCACTTAACTCCTTGGGGTCGATCAGCTACAATG' 
GCTGTAGAGGATGAATCAATCTGGCTCAGGCTAGTAAACCAGGGCTGCTTGT 
ATAAGTAGACAGCAGCCCATGTAGGGTGATTTCTGGGTTATTAAGAAGACCC • 
TGTGATCCCCAAACACACCTCACTCTGACCTATAGACATCAAAATCCCATAAT 
AAGAGTTTAACTATmGGCTTTAACCAAACAGGGATITlTAAAGCATGT^ 
GACTTCTTGCCCTTTCTCCAAGGTACAGGAATGAAACTAGGGGATGTTATAAT- 
TirGACTTATCTTTATCTGACCATATATAGCGTGCCCTGGCTTTGCAAACCATT- 
GAGGCAATGCCATAGAATTGATCCTCCCAAGGAGTTTAGTCCACTCAGCTTG 
CTTGGGGTAATATAGGCATTGQTCAGGTCAATACCCATGAAAAAGTCATTCA • 
CTTACTTATTCCACGAATACTTATTGAGGACCTACTGGGTACTCAGTACCTAG 
TTCCGTATTTCCTATGTAAGATCTGCAGAGTTTAGACATAATGACACAATGTA 
AATCAGATTGTAACTGTCCTTTTGCAAACTCATAAAATGATATAACTATATAT 
ACATCATATGCrACTGTTTTGGTATACACATTAGCATATCACACATTrCTAAA 
TTCATAGTGGGCAGAGGTCAGGGGTGGGGAAAATGATTAATTGCCACTCTTA 
CATTGGATTCCATCTAAAACCTGTCTGACTTGTATGTCCrCCAACrCrTTGGTA 
GATGAGGACATGTTTTCTITGGCrGACAGTGTTATTATTATTATTTATi^ 
ATCATAATTCCACCTCCCCTATCACAGGCCTCTCCTTCTGGGCATGTATTCCA 
TATTCTCCAGTGCAGTCTTCTGTCTGAGTGTCCAACCTCAAAAAATGGCTAGG 
AAATAGAAACrGTATAGTGGTTrTATAGCAAACTrAGTTTTGCCCAGCTTCTC 
CrTGGCCTGCACAAGGTCCATACTTATAAGCATACACCAAGCGAGGGATAAA 
GTCAGATGTTATCGCTATGACAAATGCATTTGTGATAACAGAGAGAATTCCA . 
ATGCCTTCAAGAATTC 
HuU mRNA Sequence from normal colon tissue sample 612N. The start codon is indicated in green and the stop codon in red.

GCGTTCGGAGGCCGACACCTGGAGGACGCCTCC^^^ 
. GCCTGCGCGCGAGGGATCCGGGAAGAAGTGdGSS^^?^^^ 
CGCCGCGGGGCCACCAGTTTGCGCGCAGGGCTCAG^^^ 
GACACGCCACGGGGCATCGGCACOTCGTGGTGTGGGAC^^^ 
CGGGCATGCTGGTCATCTCGGCCGCCATCGGCATCTACT^^ 
• S^SJ^S^-^'^^^^^^^^^^^^^C^CTGATGG^^^^ 
CAGTGCCCGTGGCGCTGTCCCTCACCGCTAGCITCAT^^^ 
.OraGGCACCCCCTCCGAGGTCTACCGTmGGGGCCATt^^ 

otcacctacttctttgtggtggtcatcagcgccSagS^ • 
ttgatctgtggggcgcggtagtggcaacgggggtStog^^ 

CACACTGGGTGGTCTTAAAGCAGITATCTGGACAGA^^ 
TCATGGTGGCTGGATITGCATCCGTGATrATAC^^ 

^^SiJi^^^^^^GTAAAAGCAGATTCCAGGSi^ 

ATCrrGTGGGACTCTGGGCAATCCTCACATGCTS^im " 
CTATATTCCAGGTACCATGACTGTGATCCTTGGACAGCCA^^^ 
ACCAGACCAGCTCATGCCITATITGGtACTGGACAlTCTGC^ 
GACrrCCTGGACTITTTGTGGCCTGTGCTTAC^G^C^^ 
TCCTCCAGTATTAATGCOTAGCAGCAGTAACTGTG^^Tci^^ • 
. ^CTTCAGATCGCTCrCA^AAAGGTCTCTGTcSGG^CCC^ 
GTGTGGTGTATGGAGCCCTGTGTATTGGAATGdS^^ 

GGAGCmGTTGCAGGCAGCACTCAGCGTATITGGTAT^^ • 

TATGGGCCTGTTCGCTITGGGCArmGGTTCCCmGCCAACTC^ 

CACTTGTTGGrCTGATGGCTGGATTTGCCAmCTCrATGGGSG^^ 

GCTCAAATATATCCTCCACTTCCTGAGAGAACATrG^^ ' 

•CCAAGGCTQTAACAGCACCTACAATGAGACAAAmGA^^^ 

ATGCCAmACTACTAGTGTnrPCAAATATACAATGTTC^ 

gatggataactggtattctitatcatatctgtacitcaJcact^ 

TGGTiyvCArrATTAGTGGGGATACITGTCAGTTrATCAAC^^ 

TTTGATATTTTTAAGAAAAAGAAGCATGTTTTGAGCTATAAATCACATCCAGT 
GGAAGATGGTGGANCTGAtAATCCTGCTITCAACCACATT^TG^ 
•GATCAGAGTGGCAAGAGCAATGGGACTCGTTTGraAGCTGCTCTG^^ 
GATATCCnTAAATGATGTTTCAATmATATGTITSL^GAT^ 
atacacacatatggaS?gta^^S^^

GTATTATTGAAAATATma^^^rS^^^^ • 

CACCATTCrrAGGGAGCAATGAG^^ " 
TTmAtCCCCCTATC^rcCAG^C^^ 

•atcaggaaggtggtaatcaataS?a'?^^ 

MTGATCAAAGTCTT^^G^G^^ 
. TrrCTAGAGmCTGGGCmGOT 

mATCTGCATGCCCAGnTCTS?CTl?rItnJSSS^ 

AACTGAGCTTGTClTCATAATm 
' CAAGACACTCATGTG^^^SS^S?^^ 

GACTATAAATTATGAACTCcJ^ 

GGTATAmATCCAGSJSGG^^ 

taaaacgtctggactS^^S^^S^^^^ 
agcagtctcaaagitgctJagct^^S^^ 

TrGGlTATAGGAGTrSicAA^ 

ACTGTTTAAAAG^f^^S^f^A'JS^^^ 

ATtTAGATACATACAAATOrrmS^^ 

atacgtgttatitagotZS^^ 

AATGlTAAGAAA^m^ij^^KSS aII 
MDTPRGIGTFVVWDYVWAGMLVISAAIGIYYAFAGGGQQTSKDFij^GGR
AVPVALSLTASFMSAVTVLGTPSEVYRFGAIFSIFAFITFFVVVISAEVFIJVFYK^ 
GITSTYEYI^IJlEbKCVRLCGTVLFIVQmYTGIVIYA^ • 

VATGWCTFYCTLGGULAVIWTDWQIGIMVAGFASWQAVVMQGGISTI^ • 
YDGC3U.NFWNFNPl^IX2ianFWTinGGTIT^ 

KI^LYINLVGLWAILTCSWCGIArYSRYHDCI>PWTAKKVSAPDQIA^ 

IX^DYPGLPGIJfVACAYSGTI^tVSSSINAIAAVTVEDLIKPYBRSI^ERSI^WISQ 

GMSVVYGALCIGMAAIASmGALLQAAI^WGMVGGPmGIJfALGILVPFANS^ 

GALVGmAGFAISLWGIGAQIYPPIPERTIJimDIQGCNSTV^ 

PF]TSWQIYNVQRTPL]VroNWYSI^YLYFS.WGTLVTLLVGmVSI^^ 

DPRmTKBDFI^NEDlFKpaaiVLSYKSHPV^ 